Editorial 
Message from Managing Editor 



The International Journal of Computer Science and Information Security (IJCSIS) is a refereed, 
international publication featuring the latest research findings and industry solutions involving all 
aspects of computing and security. The editorial board is pleased to present the June 2015 issue. 
The purpose of this edition is to disseminate experimental and theoretical research from both 
industry and academia in the broad areas of Computer Science, ICT & Security and further bring 
together people who work in the relevant areas. As the editors of this issue, we are glad to see a 
variety of articles focusing on the major topics of innovation and computer science: computer 
security, interdisciplinary applications, information technologies, etc. This journal promotes 
excellent research publications which offer significant contribution to the computer science 
knowledge and which are of high interest to a wide academic/research/practitioner audience. 

Over the last five years, we have witnessed significant growth of IJCSIS in several key areas, 
including the expansion of scope to recruit papers from emerging areas of green & sustainable 
computing, cloud computing security, forensics, mobile computing and big data analytics. IJCSIS 
archives all publications in major academic/scientific databases and is indexed by the following 
international agencies and institutions: Google Scholar, CiteSeerX, Cornell University Library, Ei 
Compendex, Scopus, DBLP, DOAJ, ProQuest, ArXiv, ResearchGate and EBSCO. 

We are indebted to the wonderful team of publication staff members, associate editors, and 
reviewers for their dedicated services to select and publish extremely high quality papers for 
publication in IJCSIS. In particular, I would like to thank all associate editors who have answered 
the frequent calls to process the papers assigned to them in a timely fashion. I would also like to 
thank the authors for submitting their high quality papers to IJCSIS and the readers for continued 
support to IJCSIS by citing papers published in IJCSIS. Without their continued and unselfish 
commitments, IJCSIS would not have achieved its current premier status. 

We help researchers succeed by providing high visibility and impact, prestige, and an 
efficient publication process and service. 

For further questions please do not hesitate to contact us at ijcsiseditor@cimail.com . 

A complete list of journals can be found at: 
http://sites.google.com/site/iicsis/ 
EA encryption method to encrypt the plain-OTP to cipher-OTP. Then, Quick Response (QR) code is used as a 

methods are used for security and identification purposes, but they can capture only by physical control or at a close 
distance from record search. Gait as a behavioral biometric has attracted more attention recently because it can 

method of identification of Vrutta in Sanskrit Shloka and suggests the musical notations based on identified Vrutta, 
for singing the Shloka. The designed system "Sangit Vrutta Darshika" can be used as a guide to learn the 

Abstract - This paper presents the design and analysis of high-bandwidth connected E-H and E-shaped microstrip 
patch antennas. RT Duroid 5880 dielectric substrate material is used to design these antennas. A simulation tool, 
Sonnet Suites, a planar 3D electromagnetic simulator, is used in this work. To feed patch antennas, co-axial probe 

the center frequency. The result shows that return loss is under -10 dB. Applications for proposed antennas are 

yet to receive the attention of the researchers as deserved. In other words, video and multimedia documents are 
exposed to unauthorized accessors. The authors propose image encryption using matrix transpose. An algorithm that 
would allow image encryption is developed. In this proposed image encryption technique, the image to be encrypted 
is split into parts based on the image size. Each part is encrypted separately using matrix transpose. The actual 
encryption is on the picture elements (pixels) that make up the image. After encrypting each part of the image, the 

Abstract - Handheld device systems have been used as tools for teaching people with special needs due to 
cognitive function enhancement by utility of multimedia, attractive graphics and user-friendly navigation. Can a 
handheld device system, such as a cellular phone, be used for teaching illiterate people? This paper explores and 

Abstract - A friendly interface is necessary to make the system more efficient and effective. The development of Urdu 
recognition is a key element of research as it provides an efficient and natural way of input to the computer. This 
paper presents a framework based on Urdu layout and recognition of handwritten digits and text images by using 
different techniques. After the survey on Urdu documents the following conclusion is made regarding the Data set, 

Abstract - Mapping the virtual machines to the physical machine cluster is called VM placement. Placing the 
VM in the appropriate host is necessary for ensuring effective resource utilization and minimizing the datacenter 
cost as well as power. Here we present an efficient hybrid genetic based host load aware algorithm for scheduling 
and optimization of virtual machines in a cluster of physical hosts. We developed the algorithm based on two 

Index Terms: Virtual Machine, Physical Machine Cluster, VM Scheduling, Load Rebalancing, Load Monitoring. 

16. Paper 31031501: Biometric Bank Account Verification System In Nigerian: Challenges And 
Opportunities (pp. 103-117) 

Omogbhemhe hah Mike, Department Of Computer Science, Ambrose Alli University, Ekpoma Edo State Nigeria 
Ibrahim Bayo Momodu, Department Of Computer Science, Ambrose Alli University, Ekpoma Edo State Nigeria 

Abstract - Due to the need for strong security for customer financial information in the banking sector, the sector has 
started introducing biometric fingerprint measures to secure banking systems and software. 
In this paper, we explain the methodology of using this technology in the banking sector for customer 
verification and authentication. The challenges and opportunities associated with this technology are also 
discussed. 
Abstract - Most biometric authentication methods have 
been developed under the assumption that the extracted 
features that participate in the authentication process are fixed. 
However, the quality and accessibility of biometric features face 
challenges due to position orientation, illumination, and facial 
expression effects. This paper addresses the predominant 
deficiencies in this regard and systematically investigates a 
facial authentication system in the variable-feature domain. 
In this method, the extracted features are considered to be 
variable and are selected based on their quality and accessibility. 
Furthermore, Euclidean geometry in a 2-D computational 
vector space is constructed for feature extraction. 
Afterwards, algebraic shapes of the features are computed and 
compared. The proposed method is tested on images 
from two public databases: the "Put Face Database" and 
the "Indian Face Database". Performance is evaluated based 
on the Correct Recognition (CRR) and Equal Error (EER) 
rates. The theoretical foundation of the proposed method 
and the experimental results are also presented in this 
paper. The experimental results demonstrate 
the effectiveness of the proposed method. 

Index Terms — CRR, EER, Euclidean geometry, and facial 
biometric. 

I. Introduction 

The rapid evolution of information technology has 
left traditional token-based authentication and security 
management systems insufficiently sophisticated 
to handle the challenges of the 21st century. As 
a result, biometrics has emerged as the most reasonable, 
efficient, and ultimate solution to authenticate the legitimacy 
of an individual [1-3]. Biometrics is an automated 
method of authenticating an individual based on their 
measurable physiological and behavioural characteristics. 
The common biometric traits in this characterization process 
are fingerprint, face, iris, hand geometry, gait, voice, 
signature, and keystrokes [1],[2]. Fingerprint, face, and iris 
traits are widely used in the field of biometric technology. 
Government and law enforcement organizations including 
military, civil aviation, and secret service often need to 
track and authenticate dynamic targets under surveillance. 
Organizations are also required to ensure that an individual 
in a room or crowd is the same person who had entered it. 



As a result, a step in the direction of facial biometrics 
is regarded as a conclusive solution in this area. This 
technology makes it possible to extract 
unique and undeniable physiological and behavioural characteristics 
without intruding on the target (subject) or requiring the target's 
knowledge [1-4]. 

There are many different methodologies that have been 
studied for biometric authentication systems, including 
shape of the facial features, skin color, and appearance. 
Among them, the feature-based method is the most efficient 
due to its measurability, universality, uniqueness, and 
accuracy. This approach is becoming the foundation of an 
extensive array of highly secure identification and personal 
verification solutions. The most commonly used facial features 
are the nose, eyes, lips, chin, eyebrows, and ears [5]. 
The system's performance and robustness are largely dependent 
on the feature localization and extraction process. 
This process can be defined as the selection of the relevant 
and useful information that uniquely identifies a subject of 
interest. The overall processing of the system must also be 
computationally efficient. However, the human face is a dynamic 
object with a high degree of variability in its position 
orientation and expression. Noncooperative behaviour of 
the user and environmental factors, including illumination 
effects, also play an unfavourable role in the facial feature 
extraction process. These effects contaminate the extracted 
features. Consequently, access to the same biometric 
features with the expected quality is obstructed by 
these unavoidable challenges. Therefore, a vital issue in 
facial biometrics is the development of an efficient algorithm 
for biometric authentication that overcomes 
the aforementioned challenges [1-7]. 

This paper addresses the predominant deficiency of 
facial biometrics. It then systematically investigates 
facial biometric systems under the assumption that 
facial geometry is influenced by position orientation, facial 
expression, and illumination effects. This method addresses 
two challenging issues of facial biometrics: quality 
and accessibility. In the proposed method, a new facial 
authentication algorithm is developed to address 
these issues. Furthermore, in this method, feature selection, 
extraction, and authentication are processed 
in 2-D geometrical space. Each candidate facial feature is 
considered to be a collection of geometrical coordinates in 
the Euclidean domain. The Euclidean distance between the 
candidate feature coordinates is estimated and stored as a 
vector to create the biometric template. It is then compared 
to the stored template to authenticate the legitimacy of the 
subject of interest. 
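
The template idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the landmark coordinates, the function names (`build_template`, `authenticate`), and the acceptance threshold are hypothetical, and the sketch assumes that the template is simply the vector of pairwise Euclidean distances between 2-D feature points.

```python
import math
from itertools import combinations

def build_template(points):
    """Return the vector of pairwise Euclidean distances between 2-D points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def template_distance(a, b):
    """Euclidean distance between two equal-length template vectors."""
    if len(a) != len(b):
        raise ValueError("templates must have the same length")
    return math.dist(a, b)

def authenticate(live_points, stored_template, threshold):
    """Accept when the live template lies within `threshold` of the stored one."""
    live_template = build_template(live_points)
    return template_distance(live_template, stored_template) <= threshold

# Hypothetical landmarks in pixels: left eye, right eye, nose tip, mouth centre.
enrolled = [(80.0, 100.0), (176.0, 100.0), (128.0, 150.0), (128.0, 200.0)]
stored = build_template(enrolled)
```

Because only distances between points are stored, a pure translation of the face in the image plane leaves this template unchanged; rotation out of the plane, expression, and lighting would still alter it, which is the difficulty the paper sets out to address.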

The motivation of this method is its ability to select biometric 
features based on their quality and accessibility, then 
extract them to create the biometric template. Importantly, 
the variabilities of feature selection and extraction are processed 
without sacrificing efficiency in terms of computing 
time and memory usage. For the experimental evaluation 
of the proposed method, facial images are used from two 
public databases: the "Put Face Database" and the "Indian 
Face Database". The performance of the proposed method 
is evaluated based on Correct Recognition (CRR), False 
Acceptance (FAR), and False Rejection (FRR) rates. An 
Equal Error Rate (EER) of 3.49% and CRR of 90.68% have 
been achieved by the proposed method. The experimental 
results demonstrate the superiority of the proposed method 
in comparison to its counterparts. 
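
As a rough illustration of how FAR, FRR, and EER relate, the following Python sketch estimates them from distance scores. The scores are made-up toy values, not the paper's data, and the threshold scan is one simple way to approximate the EER, assumed here for illustration only.

```python
def far_frr(genuine, impostor, threshold):
    """FAR and FRR for distance scores, accepting when score <= threshold."""
    far = sum(s <= threshold for s in impostor) / len(impostor)
    frr = sum(s > threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning observed scores as thresholds.

    Returns (eer, threshold) at the point where |FAR - FRR| is smallest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]

# Toy distance scores (smaller means more similar); purely illustrative.
genuine = [0.10, 0.20, 0.25, 0.30, 0.55]
impostor = [0.40, 0.60, 0.70, 0.80, 0.90]
eer, threshold = equal_error_rate(genuine, impostor)
```

Lowering the threshold reduces false acceptances at the cost of more false rejections; the EER is the operating point where the two error rates coincide.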

The remainder of the paper is organized as follows: 
Section II presents the literature review related to the 
proposed method; the theoretical background is presented 
in Section III; Section IV presents the detailed analysis 
and algorithmic formulation of the proposed variability 
method; the results and analysis are presented in Section 
V; and discussions and conclusions are included in Section 
VI. 

II. Literature Review 

The effects of position orientation, facial expression, and 
illumination on facial features are vital issues in biometric 
authentication. Several studies have been conducted 
to address these issues. S. Du et al. [8] presented a review 
of facial authentication methods and their associated challenges 
based on pose variations. Their methodologies were 
based on invariant feature extraction in the multi-view 
and 3D range domains under different pose variations. 
However, the authors inadequately addressed the issue of 
variability due to the combined effects of facial orientation, 
expression, and illumination. One study conducted by the 
National Science and Technology Council [9] proposed 
a Linear Discriminant Analysis (LDA) method for facial 
authentication. The authors used LDA to maximize the inter-class 
and minimize the intra-class variations, since PCA 
performance deteriorates if a full frontal face cannot be presented. 
Unfortunately, this model was designed for linear 
and homogeneous systems and faces challenges working 
with the underlying assumptions if there is an inadequate 
number of data samples in the received dataset. L. Chan et 
al. [10] proposed a linear facial biometric authentication 
system using PCA in conjunction with LDA. In that 
approach, the authors used PCA for dimension reduction, 
while LDA was used to improve the discriminant ability 
of the PCA system. The main challenge with this method 
is that it is inadequate to deal with the combined effects of 
position orientation, facial expression, and illumination. E. 
Vezzetti et al. [11] presented a geometric approach to show 
the intra-class similarity and extra-class variation between 
different faces. This was an interesting study; however, its 
main objective was to formalize some facial geometrical 
notations, which can be used to analyze the behaviour 
of faces and hence of the authentication system. B. Hwang 
et al. [12] constructed a facial database with different position 
orientations, facial expressions, and illuminations. Here the 
authors used Principal Component Analysis (PCA), Correlation 
Matching (CM), and Local Feature Analysis (LFA) 
algorithms to evaluate the performance and limitations of 
the facial authentication systems. However, they did not 
consider the variability in their feature selection method. 
F. Sayeed et al. [13] presented a facial authentication method using 
the segmental Euclidean distance. They used a 
variant of the AdaBoost algorithm for feature selection 
and trained the classifier to enhance the performance of 
the facial detection process. Afterwards, each face was 
segmented into nose, chin, eyes, mouth, and forehead 
as separate images; then the Eigenface, discrete cosine 
transform, and fuzzy features of each segmented image 
were estimated. Finally, segmental Euclidean distance and 
Support Vector Machine (SVM) classifiers were used in the 
authentication process. Variability due to different facial 
poses has been considered in this method; however, it 
is inadequate to address the issues associated with the 
combined effects of facial expression and illumination. 

J. Li et al. [14] proposed a facial authentication system 
using adaptive image Euclidean distance. In this 
adaptive method, both spatial and gray level information 
were used to establish the relationship between pixels. 
Furthermore, two gray-level measures, namely distance and cosine 
dissimilarity, were considered between pixels. The 
authors claimed that their proposed method achieved a 
promising authentication accuracy using adaptive image 
Euclidean distance in conjunction with PCA and SVM. 
However, the authors did not adequately discuss the challenges 
encountered due to position orientation, facial expression, 
and illumination effects that need to be overcome without 
sacrificing efficiency and processing time. J. Kalita et al. 
[15] proposed an eigenvector feature extraction method 
in conjunction with the estimation of minimum Euclidean 
distance to authenticate the facial image. This is 
a very interesting and straightforward approach, and the 
authors considered the challenges associated with facial 
expression. More importantly, this method would be able 
to detect the resultant facial expression of the input image. 
Unfortunately, the combined effects of expression, orientation, 
and illumination were not sufficiently addressed 
in this method. C. Pornpanomchai et al. [16] proposed 
a human face authentication method using the Euclidean 
distance estimation process along with a neural network. 
In this method, a Correct Recognition Rate (CRR) of 96% 
at a cost of 3.304 sec (per image) processing time has 
been achieved. However, this method also did not address 
possible contamination from facial expression, orientation, 
and illumination effects. H. Lu et al. [17] presented a 
new PCA algorithm in an uncorrelated multilinear PCA 
domain using unsupervised subspace learning of tensorial 
data. This system offered a methodology to maximize 
the extraction of uncorrelated multilinear biometric 
characteristics. However, it is an iterative process and is not 
sophisticated enough to deal with the combined effects 
of position orientation, facial expression, and illumination 
without compromising the computational complexity. The 
challenges associated with accessing the same biometric 
features were also not properly addressed in that method. A 
Bayesian estimator was developed by M. Nounou et al. 
[18], addressing the problem associated with the MLE and 
PCA algorithms. Unfortunately, this method was developed 
under the assumption that the system is not vulnerable 
to the combined effects of illumination, expression, and 
position orientation. J. Suo et al. [19] developed a gender 
transformation algorithm based on a hierarchy fusion strategy. 
In that approach the authors used a stochastic graphical 
model to transform the attributes of a high-resolution facial 
image into an image of the opposite gender with the same 
age and race. The main objective is to modify 
gender attributes while retaining facial identity. This is an 
interesting model; however, the authors did not consider the 
challenges of accessing the same biometric features, due 
to the associated heterogeneous nature. L. Lin et al. [20] 
proposed a hierarchical regenerative model using an "And-Or 
Graph" stochastic graph grammar methodology. In that 
model, a probabilistic bottom-up formulation was used for 
object detection, and a recursive top-down algorithm was 
used in the verification and searching process. Here, objects 
with larger intra-variance were broken into their constituent 
parts, and linking between the parts was modeled by 
the stochastic graph grammar technique. The authors also 
addressed the localization challenges due to the background 
clutter effect. However, the proposed verification process was 
developed in a homogeneous and controlled environment. 
In this method, the authors inadequately presented the 
challenges associated with the access and extraction of 
the same features. 

Therefore, in most cases, the biometric features used 
in the authentication process are fixed. Consideration of 
variability during the feature selection and extraction process 
is necessary, since accessibility of the same biometric 
features may be difficult due to facial expression, position 
orientation, and illumination effects. In this paper, 
a new biometric authentication method is presented that 
addresses these effects and their impacts on accessibility 
and quality. Variability is considered in this process 
to overcome the accessibility issue. The Sequential Subspace 
Estimator (SSE) method studied in [21] has been used to 
ensure the quality of the extracted features. Furthermore, 
Euclidean geometry in a 2-D computational vector space is 
constructed for biometric feature extraction [22]. 
Afterwards, the algebraic shape of the facial area, as well 
as the relative positions and size of the eyes, nose, and 
lips, are estimated in order to encode and create the 
biometric templates. This encoded template is then stored 
in the biometrics database in order to be compared with the 
live input encoded biometrics in Euclidean vector space. 



III. Theoretical Background 



Unlike other facial authentication methods, the proposed 
method is developed in the Euclidean domain under the assumption 
that the quality and accessibility of the extracted 
biometrics face challenges due to position orientation, 
facial expression, and illumination effects. Therefore, this 
section presents a theoretical background before getting 
into a detailed analysis of the proposed method. 



A. Euclidean Vector 

The Euclidean vector measurement is a widely used 
method for representing points in geometrical space. In 
this case, both a vector and a point (scalar quantity) in n-D 
space can be represented by a collection of n values. 
But the difference between a vector and a point lies in 
the way the geometrical coordinates are interpreted. A 
point might be considered as a scalar way of visualizing a 
vector. The transformation between a vector and a point 
in the 2-D geometrical coordinate system is shown in 
Fig. 1(a). A Euclidean vector can be represented by a 
line segment with a definite magnitude and direction. The 
algebraic manipulation process of the Euclidean vector in 
2-D geometrical space is shown in Fig. 1(b). In fact, all 
points in the Cartesian coordinate system can be defined 
in Euclidean vector space where a geometrical quantity 
is expressed as tuples splitting the entire quantity into 
its orthogonal-axis components. These points are scalar 
quantities that can also be used to estimate the algebraic 
relationship among the objects (images). 

Now, if n-tuple points in n-space are represented by $R^n$, 
then two vectors, $\mathbf{u} = (u_1, u_2, u_3, \ldots, u_n)$ 
and $\mathbf{v} = (v_1, v_2, v_3, \ldots, v_n)$, shown in Fig. 1(b), are equal 
if $u_1 = v_1, u_2 = v_2, u_3 = v_3, \ldots, u_n = v_n$. Their other 
properties can be presented as follows [23], [24]: 

$$\mathbf{u} + \mathbf{v} = (u_1 + v_1,\; u_2 + v_2,\; u_3 + v_3,\; \ldots,\; u_n + v_n)$$

$$k(\mathbf{u} + \mathbf{v}) = k\mathbf{u} + k\mathbf{v}$$

The distance between two points u and v: 

$$\mathbf{v} - \mathbf{u} = (v_1 - u_1,\; v_2 - u_2,\; v_3 - u_3,\; \ldots,\; v_n - u_n)$$

$$\|\mathbf{u} - \mathbf{v}\| = \sqrt{(\mathbf{u} - \mathbf{v}) \cdot (\mathbf{u} - \mathbf{v})}$$

$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{\sum_{i=1}^{n} (v_i - u_i)^2} = \sqrt{(v_1 - u_1)^2 + (v_2 - u_2)^2 + (v_3 - u_3)^2 + \cdots + (v_n - u_n)^2}$$

The magnitude: 

$$\|\mathbf{u}\| = \sqrt{\mathbf{u} \cdot \mathbf{u}} = \sqrt{u_1^2 + u_2^2 + u_3^2 + \cdots + u_n^2}$$

where k is a scalar quantity. 

The geometrical representation of u and v in $R^n$ is 
shown in Fig. 1. 
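
The vector properties above can be checked numerically with a short Python sketch. The helper names and the sample vectors are illustrative assumptions, not part of the paper.

```python
import math

def add(u, v):
    """Component-wise sum u + v."""
    return [a + b for a, b in zip(u, v)]

def scale(k, u):
    """Scalar multiple k * u."""
    return [k * a for a in u]

def dot(u, v):
    """Dot product u . v."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Magnitude ||u|| = sqrt(u . u)."""
    return math.sqrt(dot(u, u))

def distance(u, v):
    """Euclidean distance d(u, v) = ||u - v||."""
    return norm([a - b for a, b in zip(u, v)])

# Illustrative 2-D vectors and scalar.
u = [1.0, 2.0]
v = [4.0, 6.0]
k = 3.0

left = scale(k, add(u, v))             # k(u + v)
right = add(scale(k, u), scale(k, v))  # ku + kv
```

For these values both sides of the distributive property evaluate to the same vector, and the distance between u and v is 5, since the differences are 3 and 4.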

In the proposed method, using the same analogy, a 
Euclidean vector in 2-D geometrical space is constructed 
for the feature extraction, estimation, and authentication 
process. In particular, each assigned point of the 
candidates' biometric features is considered to be a 2-D 
geometrical coordinate in the Euclidean vector space 
[22]. This feature extraction, estimation, and authentication 
process is presented in Section IV-B. 

B. Facial Anatomy 

Facial authentication is an everyday task, as humans 
can identify faces without extra effort. Typically, the face 
has inherent characteristics with distinguishable landmarks, 
different peaks, and approximately 80 nodal points [25]. 
Building an automated system to authenticate an individual 
using facial geometry can be done by extracting facial 
biometric features, including the size or shape of the eyes, lips, 
nose, cheekbone, and jaw, as well as their relative distances 
(or positions) and orientation. Authentication typically uses 
an algorithm that compares input data with the biometrics 
stored in the database. The authentication process based 
on facial features is fast and accurate under favorable 
constraints, and as a result this technology is evolving 
rapidly. Unlike biometric authentication using other traits, 
authentication using facial biometrics can be done easily 
in public or in noncooperative environments. In this case, 
the subject's awareness is not required. A typical facial 
biometric pattern in 2-D geometrical space is shown in 
Fig. 2 [26],[27]. 

C. Face Databases 

In this method, facial images from the two public 
databases, the "Put Face Database" and the "Indian 
Face Database", are used [29], [30]. The sizes of the 
two databases are presented in Table I. The "Put Face 
Database" is a highly nonlinear and heterogeneous 3D 
facial database. It contains approximately 20 images per 
person with a total of 200 people, and stores 2048 x 1536 
pixel images [30]. The main motivation for using the 
"Put Face Database" is that the diversity of the image 
subsets allows them to be easily used for training, testing, 
and cross-validation processes. This is possible because the 
images in this database have more than 20 orientations 
for an individual using various lightings, backgrounds, and 
facial expressions. In addition, this database 
contains 2193 landmarked images [31]. A sample of the 
facial images from the "Put Face Database" is shown in 
Fig. 3. 

On the other hand, images in the "Indian Face Database" 
are less influenced by the facial expression, position 
orientation, and illumination effects. There are 40 subjects, 
each having 11 images with the same homogeneous background. 
The size of each image is 640 x 480 with 256 
gray levels per pixel. The main reason for using two types 
of databases is to find out the combined effects of two 
different environments. It is also important to show that 
the proposed method is the optimal solution not only for 
the images highly influenced by the underlying challenges, 
but also for the images that are less affected by 
them. A sample of the facial images from the "Indian Face 
Database" is given in Fig. 4. 



TABLE I: The Details of Two Databases 

Databases      Original Image Size (Pixels)   Modified Image Size (Pixels) 
Put Face       2048x1536 (color)              256x256 (gray) 
Indian Face    640x480 (gray)                 256x256 (gray) 
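
Table I implies a preprocessing step that converts each image to a gray-level image of a common size. The paper does not state how this was done, so the Python sketch below is only an assumption: it uses the widely used ITU-R BT.601 luma weights for the gray conversion and nearest-neighbour sampling for the resize, and it operates on a tiny made-up image so that it runs with the standard library alone.

```python
def to_gray(pixel):
    """Convert an (R, G, B) pixel to a gray level using BT.601 luma weights."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def resize_gray(image, out_h, out_w):
    """Nearest-neighbour resize of an RGB image (list of rows) to gray levels."""
    in_h, in_w = len(image), len(image[0])
    return [
        [to_gray(image[y * in_h // out_h][x * in_w // out_w]) for x in range(out_w)]
        for y in range(out_h)
    ]

# Tiny illustrative 4x4 image: two white rows above two black rows.
white, black = (255, 255, 255), (0, 0, 0)
tiny = [[white] * 4, [white] * 4, [black] * 4, [black] * 4]
small = resize_gray(tiny, 2, 2)
```

Calling `resize_gray(image, 256, 256)` on a full-size image would produce the 256x256 gray format listed in the table, although an image library with proper interpolation would normally be preferred in practice.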



IV. Variability Modeling Method 

The studies of many facial biometric authentication 
methods have been based on the geometrical feature 
extraction and selection process. As previously mentioned, 
most of those algorithms have been developed under the 
assumption that the extracted candidate features for the 
authentication process are fixed. However, there are challenges 
in accessing the same facial geometric features, 
caused by effects due to facial orientation in the time 
domain. In addition, even if the facial features are accessible, 
their quality is contaminated by expression and 
illumination, due to the dynamic properties of the human 
face and environmental factors, respectively. Some studies 
have also been conducted based on variabilities in the 
feature extraction and selection process, but those methods 
did not consider the combined effects of facial expression, 
orientation, and illumination. In addition, in most cases, these 
variabilities were introduced at the cost of processing time, 
storage, and memory. The proposed authentication method 
is developed under the assumption that the extracted facial 
biometrics are vulnerable to position orientation, facial 
expression, and illumination effects. More importantly, it 
is considered that these effects could degrade the quality 
and accessibility of the desired features. Therefore, the 
proposed variability method combines two 
challenging issues: quality feature extraction (i.e. desired 
features) and variability of the authentication process (i.e. 
feature selection and its desired estimate). 

Fig. 1: Euclidean Vector in 2-D Geometry. (a) Vector (numbers/points with magnitude and direction) and scalar (number/point with magnitude). (b) k(U+V) = kU+kV. 

Fig. 2: Features in 2-D Geometrical Space [26], [27]. 

Fig. 3: Sample Facial Images - Put Face Database. (a) First row: original image; (b) second row: computation of facial boundaries; (c) third row: extracted face; (d) fourth row: extracted features. 

Fig. 4: Sample Facial Images - Indian Face Database. (a) First row: original image; (b) second row: computation of facial boundaries; (c) third row: extracted face; (d) fourth row: extracted features. 

A. Quality Feature Extraction 

The challenges associated with position orientation, facial 
expression, and illumination effects are vital issues 
for the exploitation of facial biometrics. These effects 
obstruct the accessibility and deteriorate the quality of 
the biometric features. The Sequential Subspace Estimator 
(SSE) method studied in [21], [23] addressed the challenges 
of finding quality facial biometrics that are contaminated 
by these effects. In that method, a recursive sequential estimator 
algorithm was developed in the image subspace. 
The system performed a sequential recursive filtering process 
in order to ensure that the biometrics are of good 
quality. The SSE approach is based on the minimization 
of noise and maximization of information contained in the 
received data, in the mean squared error (MSE) sense. 

Now, consider that the facial images have been received 
as vectors of a matrix x. Each row and column of the 
received dataset x represents an observation and a particular 
type of datum, respectively. If the received dataset 
is contaminated by noise, then the received images can be 
written as: 

$$\mathbf{x} = \mathbf{s} + \mathbf{n} \tag{1}$$

where n is the noise matrix, and s is the noise-free or 
desired dataset. 

Principal components can be derived from the x dataset, 
and these derived components can be written as [32], [33]: 

$$\mathbf{z} = \mathbf{w}^{T}\mathbf{x}$$

Therefore, using Eq. (1): 

$$\mathbf{z} = \mathbf{w}^{T}\mathbf{s} + \mathbf{w}^{T}\mathbf{n} \tag{2}$$



where w represents the weight vectors that map each row 
vector of x, z is the projected data, which inherits the 
maximum possible variance from the x dataset, and each 
of the weight vectors w is constrained to be a unit vector 
[34]. 
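
To make the projection z = w^T x concrete, the following Python sketch finds a unit weight vector with maximum variance using power iteration on the sample covariance matrix and then projects the centred data onto it. This is a generic first-principal-component computation on a small made-up dataset, offered as an assumption-laden illustration; it is not the SSE algorithm of [21], [23].

```python
import math

def center(rows):
    """Subtract the column means so each variable has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[value - m for value, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of already centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iterations=200):
    """Dominant eigenvector of cov by power iteration, normalised to unit length."""
    w = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, w)) for row in cov]
        length = math.sqrt(sum(x * x for x in w))
        w = [x / length for x in w]
    return w

def project(rows, w):
    """Projection z_i = w^T x_i for every row x_i."""
    return [sum(a * b for a, b in zip(row, w)) for row in rows]

# Made-up observations (rows) of two correlated variables (columns).
x = [[2.0, 1.9], [0.0, 0.1], [-2.0, -2.1], [1.0, 1.2], [-1.0, -1.1]]
xc = center(x)
w = first_component(covariance(xc))
z = project(xc, w)
```

The variance of z is larger than the variance of either original column, which is the sense in which z inherits the maximum possible variance under the unit-norm constraint on w.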

The MSE between the desired features and the processor 
output can be defined as follows [21], [23]: 

$$e(t) = d(t) - y(t) \tag{3}$$

$$\min_{\|\mathbf{w}_c\| = 1} \mathrm{MSE} = E\left[\,|e(t)|^{2}\,\right] \tag{4}$$



The main objective is to determine the minimum value
of the Mean Squared Error (MSE), i.e. the Minimum Mean
Squared Error (MMSE). With this, one would be able to
decode the desired biometric features from the underlying
noise environment and maximize the mutual information.
The detailed analysis and formulation of the SSE
algorithms have been studied in [21], [23].
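The quantities in Eqs. (1)-(4) can be illustrated with a minimal sketch. It is not the SSE algorithm of [21], [23]; it only shows that the projection in Eq. (2) splits into a signal part and a noise part, and how the MSE of Eqs. (3)-(4) is evaluated. The numeric data, the weight vector, and the helper names (`dot`, `normalize`, `mse`) are illustrative assumptions.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(w):
    """Scale w to unit length, matching the constraint ||w|| = 1."""
    norm = math.sqrt(dot(w, w))
    return [wi / norm for wi in w]


def mse(desired, output):
    """Mean of |e(t)|^2 with e(t) = d(t) - y(t), as in Eqs. (3)-(4)."""
    errors = [d - y for d, y in zip(desired, output)]
    return sum(e * e for e in errors) / len(errors)


# Hypothetical noise-free data s and noise n (one row per observation).
s = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
n = [[0.1, -0.1], [-0.2, 0.1], [0.0, 0.2]]
x = [[si + ni for si, ni in zip(s_row, n_row)] for s_row, n_row in zip(s, n)]  # Eq. (1)

w = normalize([1.0, 2.0])                 # assumed unit weight vector
z = [dot(w, row) for row in x]            # z = w^T x
z_signal = [dot(w, row) for row in s]     # w^T s
z_noise = [dot(w, row) for row in n]      # w^T n

# Treating w^T s as the desired output, the MSE equals the mean squared
# projected noise.
error_power = mse(z_signal, z)
```

In this toy setting the error power is exactly the average of the squared projected noise terms, which is the quantity a minimum-MSE weight vector would try to make small.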

B. Variability Method in Authentication Process 

The consideration of variability during the feature selec- 
tion and extraction process is unavoidable. The accessibil- 
ity of the same biometric features is a complex task since 
the human face is a dynamic object with a high degree of 
variability. In this case, the Euclidean distance measurement
is used to formulate the proposed variability
measure. In this method, images are transformed into vector
spaces and maintain a direct relationship between objects
in geometrical spaces. The main reason for using the
Euclidean measurement in the proposed method is that
it can represent these points as a collection
of real numbers. Afterwards, these points are used to
establish an algebraic relationship among the objects in
the vector space, which are then transformed into linear
scalar quantities. These quantities are easy to manipulate
and can respond to variabilities during the
feature selection, extraction, and estimation processes.

In the proposed Euclidean geometrical method, the de- 
tected face is represented in the 2-D geometrical domain. 
Afterwards, biometric templates are created from the ex- 
tracted facial area, eyes, lips, and nose, along with their 
relative positions. In this case, the proposed Euclidean
geometrical method, in conjunction with the Sequential
Subspace Estimator (SSE), is used to overcome the
challenges associated with feature quality and accessibility due
to facial expression, orientation, and illumination effects.
More specifically, each extracted feature is considered to 
be a separate image. Thus four biometric templates are 
created from one facial image which can then be stored 
as a single template in the database system. This single 
template is treated as a template set for an individual and 
contains 4 subsets of templates. Furthermore, the features 
are transformed into a Euclidean metric where an estimate 
of the distance of a set of vectors is performed against 
a reference point 'O' shown in Fig. 5. In this case, if
$p = [p_1\ p_2\ p_3\ \ldots\ p_n]$ and $q = [q_1\ q_2\ q_3\ \ldots\ q_n]$ are
considered to be in $\mathbb{R}^n$ and in the 2-D vector space, then
the transformed metric P in the Euclidean domain satisfies
the following condition:

$$Pp \cdot Pq = p \cdot q$$

$$\text{such that: } PP^T = I \tag{5}$$

where $P^T$ is the transpose of P and I is an identity matrix.

Euclidean Distance 

Consider two images that can be written as the vectors
$p = [p_1\ p_2\ p_3\ \ldots\ p_n]$ and $q = [q_1\ q_2\ q_3\ \ldots\ q_n]$. According
to Section III-A, the distance between the two images in
the Euclidean domain can be stated as follows:

$$d = \sqrt{\sum_{i=1}^{N} (q_i - p_i)^2} = \sqrt{(q-p)^T (q-p)}$$

Normalized outcome:

$$d = \sqrt{(v-u)^T (v-u)} \tag{6}$$



A Euclidean metric matrix Q is developed based
on the normalized spatial distances (i.e. spatial
relationships between two points) between the pixels of the
respective biometric features. Therefore, according to Eq.
(5) and Eq. (6), the Euclidean geometrical formula for the
proposed method in 2-D vector space can be stated as
follows:

$$M = \sqrt{(v-u)^T Q (v-u)}$$

$$\text{subject to: } QQ^T = I \tag{7}$$

where M is the desired estimate.
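The following is a small sketch of the relations in Eqs. (5)-(7), assuming 2-D points and using a rotation matrix as one example of an orthogonal transform. The points, the angle, and the helper names are illustrative assumptions. Because the quadratic form under the square root in Eq. (7) is not guaranteed to be non-negative for an arbitrary orthogonal Q, the sketch evaluates Eq. (7) only for Q = I, where it reduces to Eq. (6).

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in m]


def rotation(theta):
    """2-D rotation matrix; it is orthogonal, so P P^T = I as in Eq. (5)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def euclidean_distance(p, q):
    """sqrt((q - p)^T (q - p)), as in Eq. (6)."""
    diff = [qi - pi for pi, qi in zip(p, q)]
    return math.sqrt(dot(diff, diff))


def weighted_measure(u, v, Q):
    """sqrt((v - u)^T Q (v - u)), as in Eq. (7)."""
    diff = [vi - ui for ui, vi in zip(u, v)]
    return math.sqrt(dot(diff, matvec(Q, diff)))


p = [3.0, 4.0]                      # hypothetical feature points
q = [1.0, 2.0]
P = rotation(math.pi / 6)           # assumed orthogonal transform
identity = [[1.0, 0.0], [0.0, 1.0]]

preserved = dot(matvec(P, p), matvec(P, q))   # equals p . q up to rounding
distance = euclidean_distance(p, q)
measure = weighted_measure(p, q, identity)    # Q = I gives back Eq. (6)
```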

C. Biometric Template Matching 

The proposed method is developed under the assumption 
that the extracted biometric features are highly influenced 
by position orientation, facial expression, and illumination 
effects. More importantly, it has been assumed that the 
candidate biometric features to be extracted are not fixed 
and accessing them may be difficult due to this assumption. 
As a result, four biometric features including facial area, 
eyes, lips, and nose, along with their relative positions 
(i.e. O as the reference point in Fig. 5) have been extracted
from the facial image of an individual. Each is considered 
a separate image. These four templates are then stored 
(enrolled) as a single biometric template in the biometric 
database system. Therefore, the set contains four subsets 
of templates created from an individual's facial image. 
On the other hand, during the matching process, any 
two accessible biometric features along with their relative 
positions have been extracted from the live input facial 
image (i.e. test input or image). These two extracted images 
are used to create two subsets of biometric templates. Two 
test subsets have been selected and extracted based on the 
accessibility and quality of the features in the live input 
image. These two templates and their relative positions 
are then compared with the corresponding two of the four 
stored templates (i.e. 2 of the 4 subsets) in the database. 

Therefore, the biometric databases contain one set of 
templates for each individual, and each template contains 
four subsets of templates constructed from the extracted 
facial area, and size of the eyes, nose, and lips along with 
their relative positions. In this case, each set of biometric 
templates uniquely represents an individual's identity, as 
each subset identifies a specific feature of that individual. 
The system diagram of this process is shown in Fig. 6. 

D. Computational Complexity 

Computational complexity is an important issue for the
proposed method. Starting with Eq. (4), the computational
complexity for the vector operation (matrix of vectors) is
$O(N^2)$, and for Eqs. (5) and (6) it is also $O(N^2)$.

V. Results and Analysis 

The variability method for the authentication (identification
and verification) system was tested on images
from two public databases: the "Put Face Database" and
the "Indian Face Database". In the experiment, we used the
"Put Face Database" to create two image databases,
dB1 and dB2, containing 30 and 50 subjects, respectively.
Each database contains 10 images of each subject; thus
there were 300 and 500 images in databases dB1 and dB2,
respectively. In this process, 7 out of 10 facial images from
each subject were used to train the system. The remaining
three images of each subject were used for testing purposes.
The "Indian Face Database" was also used to create two
image databases, dB3 and dB4, containing 10 and
20 subjects, respectively. Each database contains 6 images
of each subject; thus there were 60 and 120 images in
databases dB3 and dB4, respectively. In this process, 4
out of 6 facial images from each subject were used to train
the system. The remaining two images of each subject were used
for testing purposes.

Fig. 5: Extraction of Facial Features - Put Face Database.

Fig. 6: Searching and Matching Process.
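The image counts implied by the training/testing split in this section can be checked with a few lines of arithmetic. The function name and the dictionary layout are illustrative; the subject and image counts are the ones stated in the text.

```python
def split_counts(subjects, images_per_subject, train_per_subject):
    """Return (total, training, testing) image counts for one database."""
    total = subjects * images_per_subject
    training = subjects * train_per_subject
    return total, training, total - training


# (subjects, images per subject, training images per subject)
databases = {
    "dB1": (30, 10, 7),
    "dB2": (50, 10, 7),
    "dB3": (10, 6, 4),
    "dB4": (20, 6, 4),
}

counts = {name: split_counts(*spec) for name, spec in databases.items()}
for name, (total, training, testing) in counts.items():
    print(f"{name}: {total} images, {training} training, {testing} testing")
```

The resulting testing counts (90, 150, 20, and 40) agree with the numbers of testing samples reported for the verification experiment.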

In both cases, we stored four biometric templates for
each individual, created from the facial area and the
size of the eyes, lips, and nose, along with their relative
positions. However, comparisons between the input and
the stored biometrics were done with any two available
features along with their relative positions. Images were
taken at different orientations and with different facial
expressions, as well as under different lighting conditions.
The maximum size of the training dataset was approximately 17.5 MB.
Since the proposed biometric authentication method has
two modes, identification and verification, the performance
evaluation of the proposed method was conducted based on
these two modes.

A. Identification 

The experiment for the identification process was conducted
using databases dB1, dB2, dB3, and dB4. In
this process, the received image was compared with all
of the stored images in the database. There were 300,
500, 60, and 120 images in databases dB1, dB2, dB3,
and dB4, respectively; therefore there were 300, 500, 60,
and 120 sets (two templates for each set) of identification
attempts. The performance of the identification process was
evaluated using CRR, and the averages were recorded.
Comparisons of the proposed method with the state-of-the-art
algorithms PCA, LDA, and MLE were also recorded
and are shown in Table II and Fig. 7.

TABLE II: Performance Evaluation in (%) - CRR Comparison

Methods           dB1     dB2     dB3     dB4     Average
Proposed Method   88.30   86.25   94.50   93.65   90.68
PCA               66.45   59.80   78.65   74.80   70.19
LDA               72.25   67.35   81.50   78.45   74.89
MLE               70.85   66.05   80.20   76.65   73.44
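The "Average" column of Table II can be recomputed as the plain mean of the four per-database CRR values, as sketched below. The dictionary layout is an assumption made for illustration. For the proposed method, LDA, and MLE the recomputed means round to the listed values; for PCA the mean of the four listed entries is about 69.93 rather than the listed 70.19, which may reflect rounding or a transcription difference in the source.

```python
# CRR values (%) for dB1, dB2, dB3, dB4, copied from Table II.
crr = {
    "Proposed Method": [88.30, 86.25, 94.50, 93.65],
    "PCA": [66.45, 59.80, 78.65, 74.80],
    "LDA": [72.25, 67.35, 81.50, 78.45],
    "MLE": [70.85, 66.05, 80.20, 76.65],
}


def mean(values):
    return sum(values) / len(values)


averages = {method: mean(values) for method, values in crr.items()}
for method, value in averages.items():
    print(f"{method}: {value:.3f}")
```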



B. Verification 

The verification of a genuine person was conducted by 
comparing the facial image of each person with the other 
facial images of the same person. Imposter processing was 
conducted by comparing the facial image of one person 
with the facial images of other persons. There were 90,
150, 20, and 40 testing samples for databases dB1, dB2,
dB3, and dB4, respectively; therefore there were 90, 150,
20, and 40 sets (two templates for each set) of genuine
matches. The verification performance was evaluated using
the False Acceptance Rate (FAR), False Rejection Rate
(FRR), and Equal Error Rate (EER). The percentages of
FAR and FRR and the corresponding EER points were
determined and the experimental results were recorded.
Comparisons of the proposed method with the state-of-the-art
algorithms PCA, LDA, and MLE were also collected and
are shown in Tables III-V and Figs. 8-11. The average
execution time for each database is given in Table VI.
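As a rough illustration of how FAR, FRR, and EER relate, the sketch below assumes that each comparison yields a distance score and that a comparison is accepted when its distance is at or below a threshold. The scores are made-up values and the function names are hypothetical; the paper does not state how its operating points were computed.

```python
def far_frr(genuine, impostor, threshold):
    """FAR and FRR at one threshold; distances <= threshold are accepted."""
    false_accepts = sum(1 for d in impostor if d <= threshold)
    false_rejects = sum(1 for d in genuine if d > threshold)
    return false_accepts / len(impostor), false_rejects / len(genuine)


def approximate_eer(genuine, impostor):
    """Scan candidate thresholds and return (threshold, EER estimate)
    at the point where FAR and FRR are closest to each other."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, (far + frr) / 2)
    return best[1], best[2]


# Hypothetical distance scores, for illustration only.
genuine = [0.10, 0.15, 0.20, 0.35, 0.50]
impostor = [0.30, 0.45, 0.60, 0.70, 0.90]

threshold, eer = approximate_eer(genuine, impostor)
```

Lowering the threshold reduces FAR at the cost of a higher FRR, and the EER is the operating point where the two rates coincide.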



TABLE III: Performance Evaluation in (%) - FAR, FRR, and EER Comparison

                  dB1                     dB2
Methods           FAR     FRR     EER     FAR     FRR     EER
Proposed Method   0.87    6.10    3.65    3.75    8.70    5.80
PCA               8.60    9.25    10.1    9.50    13.40   15.65
LDA               7.65    5.30    8.20    4.55    12.85   12.37
MLE               7.20    8.90    9.50    8.75    12.65   14.25

TABLE IV: Performance Evaluation in (%) - FAR, FRR, and EER Comparison

                  dB3                     dB4
Methods           FAR     FRR     EER     FAR     FRR     EER
Proposed Method   0.82    3.55    1.85    0.84    3.85    2.65
PCA               2.15    4.25    5.15    3.25    5.30    7.45
LDA               1.50    3.85    3.75    3.85    4.60    5.90
MLE               1.35    3.75    2.50    3.25    4.15    6.50

Fig. 7: Identification - Performance Comparison.

Fig. 8: Verification - Performance Evaluation.

[Panel titles: Performance Evaluation - FAR and FRR; ROC Curve - Performance Evaluation]

Fig. 9: Verification - Performance Evaluation.

Fig. 10: Verification - Performance Evaluation.

[Panel titles: Performance Evaluation - FAR and FRR; ROC Curve - Performance Evaluation]

Fig. 11: Verification - Performance Evaluation.

TABLE V: Performance Evaluation in (%) - EER Comparison

Methods           Put Face   Indian Face   Average
Proposed Method   4.73       2.25          3.49
PCA               12.88      6.30          9.59
LDA               10.29      4.83          7.56
MLE               11.88      4.50          8.18

TABLE VI: Average Execution Time in Seconds

Authentication   dB1     dB2     dB3     dB4
Identification   35.40   57.16   12.52   19.39
Verification     4.34    5.41    2.57    3.25

VI. Discussions and Conclusions

The proposed variability method addressed two important
issues of facial biometrics, quality and accessibility,
for biometric authentication. In this experiment, it is
assumed that the challenges during the feature
selection and extraction process are due to the combined
effects of position orientation, facial expression, and
illumination on the biometric features. A variability method for
facial authentication has been developed in the Euclidean
2-D vector space. The extracted biometrics are
considered as a collection of points in the 2-D geometrical
coordinate system. In this experiment, two
databases, dB1 and dB2, were created from the "Put
Face Database", containing 30 and 50 subjects, each
with 10 images. Likewise, two databases, dB3 and dB4, were
created from the "Indian Face Database", containing
10 and 20 subjects, each with 6 images. The "Indian
Face Database" is less influenced by the effects of
various lightings, backgrounds, and facial expressions. The
main reason for using two different public databases is to
test the proposed variability method under two different
environmental conditions and discover the average effect
of the facial authentication process. Furthermore, in both
cases, four biometric templates (from an individual image)
were created using the extracted facial area, eyes, lips, and
nose features and stored in the database as a
single template set for an individual, each set with 4 subsets of
templates. During the comparison process, two templates
were created from the extracted live input biometrics.
These templates were compared with two of the four
corresponding stored subsets of templates.

The experimental results of the authentication process
are recorded in Tables II-VI, and the Receiver Operating
Characteristic (ROC) curves of the proposed method based
on the four databases are also included. The ROC curve
measures the performance of the verification system. The FAR
and FRR presented in the ROC curves characterize the
verification accuracy, and the EER point represents the
performance of the verification system. The experimental
results of the verification process are recorded in Tables
III-V. In addition, the performance of the identification
process for the proposed method is evaluated based on
CRR, and these results are recorded in Table II.
Furthermore, the simulation outcomes for the identification
and verification are presented in Figs. 7-11. More
importantly, the performance of the proposed method is
analyzed and compared with three state-of-the-art
algorithms, namely PCA, LDA, and MLE. The experimental
results show that the proposed method outperforms its
counterparts with a promising CRR of 90.68% and an EER
of 3.49%.
Abstract - Remote user authentication is the most fundamental
procedure for identifying the legitimate users of a
web service on the Internet. In general, the password-based
authentication mechanism provides the basic capability to
prevent unauthorized access. Many researchers have
proposed password-based authentication schemes
that rely on a single channel for authentication. However, to
achieve better security, it is possible to engage multiple channels
for authenticating users. In this paper, we propose an efficient
one-time password (OTP) based authentication protocol over
a multi-channel architecture. The proposed protocol
employs the RC4-EA encryption method to encrypt the plain-OTP
into a cipher-OTP. Then, a Quick Response (QR) code
is used as a data container to hide this cipher-OTP. The
purpose of the protocol is to integrate a web-based application
with mobile-based technology to communicate with the remote
user over a multi-channel authentication scheme. The main
advantage of the proposed protocol is that it secures the
authentication system by protecting the OTP from
eavesdropping attacks. Also, by integrating a web-based application
with mobile-based technology as a multi-channel scheme, the
proposed protocol helps to overcome many challenging attacks
such as replay attacks, DoS attacks, man-in-the-middle (MITM)
attacks, real-time phishing (RTP), and other malware attacks.

Keywords - Authentication; Multi-Channel Authentication
(MCA); Data hiding; Quick Response (QR) code;
Encryption.

I. Introduction 

The Internet has become the most convenient environment for
business, education, bill paying, and e-commerce around
the world [1]. Thus, Internet security is an important issue
in preventing confidential information from being accessed
by unauthorized users [2]. Remote authentication of users has
recently become one of the most important services on the Internet.
Remote user authentication is the process of identifying
a legitimate user of a particular web service on the
Internet [3].

Most authentication schemes use a smart card, debit
card, or Automated Teller Machine (ATM) to restrict access to
resources [4]. These schemes are impractical due to their
infrastructure requirements [5]. Owing to their low cost,
efficiency, and portability, passwords are the most common
and convenient way to authenticate the remote user [6].
However, such passwords become a sensitive target for
attackers, which can compromise the authentication
schemes [7]. Thus, using a one-time password (OTP) is an
efficient way to secure the authentication scheme. An
OTP is a user password that changes with
every user login [8].

This paper proposes a one-time password (OTP) authentication
protocol for remote user login. The plain-OTP
is encrypted into a cipher-OTP using the RC4-EA
encryption method in order to keep it secret [9]. Since
cryptosystems have grown considerably, encrypting the
contents of the plain-OTP alone would not be enough. Hence,
the very existence of the OTP should also be
kept secret. Thus, a Quick Response (QR) code is used as
a data container to hide the cipher-OTP [10]. Also, to ensure
safe and secure remote user authentication, multi-channel
authentication (MCA) is used [11]. The idea behind
using MCA is to ensure the integrity and authenticity of user
authentication [12], so that an attacker must compromise
several independent channels before gaining full access to
the user account [13].

The advantages of the proposed user authentication
protocol are that it protects the OTP from eavesdropping
attacks by adopting the RC4-EA encryption method and
the QR-code technique, and that it overcomes the drawbacks
of man-in-the-middle/browser (MITM/B), real-time
phishing/pharming (RTP/P), and malware attacks by
integrating a web-based application with mobile-based
technology as multiple channels.



The rest of this paper is organized as follows: Section II
presents an overview of the one-time password (OTP) technique,
the dynamic RC4-EA encryption method, data hiding using
QR codes, and multi-channel authentication. Section
III introduces the proposed authentication protocol. Section
IV gives the implementation and security analysis. Finally,
Section V contains the concluding remarks.
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II. An overview 

A. One Time Password Technique (OTP) 

One Time Password (OTP) authentication is used to secure
websites and to minimize the potential
for unauthorized access [14]. The concept behind an OTP is that
it can be used only once: it is valid for only one
login session or for a very short period of time [15]. Even
if an attacker is capable of obtaining this user credential,
it may either no longer be valid or be prohibited from
additional use. An OTP can help in mitigating a typical phishing
attempt or a replay attack [16]. Various algorithms for the
generation of OTPs are listed below [14]:

1) Based on time synchronization between the authentication
server and the client providing the password,
where OTPs are valid only for a short period of time.

2) Using a mathematical algorithm to generate a new 
password based on the previous password, where 
OTPs are effectively a chain and must be used in a 
predefined order. 

3) Using a mathematical algorithm where the new pass- 
word is based on a challenge (e.g., a random number 
chosen by the authentication server) and/or a counter. 
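The counter-based and time-based approaches in the list above can be sketched with the widely used HOTP construction (RFC 4226) and its time-based variant. This is shown only as a familiar instance of those two categories; it is not claimed to be the OTP generator used by the proposed protocol, and the shared secret below is the published RFC test key rather than a value from the paper.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based OTP following the HOTP construction (RFC 4226)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                       # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at=None, step: int = 30) -> str:
    """Time-based OTP: the counter is the number of elapsed time steps."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step))


secret = b"12345678901234567890"     # RFC 4226 test key, not a real credential
first_code = hotp(secret, 0)
timed_code = totp(secret, at=59)     # falls in time step 1 for a 30 s step
```

Both parties must share the secret and stay synchronized on the counter or on the clock, which is the synchronization requirement mentioned in item 1.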

B. Dynamic RC4-EA Encryption Method 

Cryptography plays a major role in preventing eavesdropping
on sensitive information [17]. ElDahshan et al. proposed a
dynamic RC4-EA method [18] for encrypting
and decrypting plaintext. The advantage of the RC4-EA
method is that it increases the security of the system
by generating the secret keys dynamically: an
Evolutionary Algorithm (EA) is adapted to generate a
dynamic secret key that is used as a seed in the RC4 encryption
algorithm, which makes the final keystream harder for
an attacker to recover. An XOR operation is then performed between
this final keystream generated by the RC4-EA method and
the plaintext to obtain the ciphertext, and vice versa [18].
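The RC4 part of this scheme can be sketched as follows. The evolutionary key generation of [18] is not reproduced here; the key passed in is a fixed stand-in for the dynamically generated seed, and the function names are illustrative. Plain RC4 has known weaknesses, so the sketch is for illustration of the keystream and XOR steps only.

```python
def rc4_keystream(key: bytes, length: int) -> bytes:
    """Generate `length` keystream bytes with standard RC4 (KSA + PRGA)."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # key-scheduling algorithm
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for _ in range(length):                   # pseudo-random generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def rc4_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    keystream = rc4_keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))


seed_key = b"Key"                             # stand-in for the EA-generated key
cipher_otp = rc4_xor(seed_key, b"Plaintext")
recovered = rc4_xor(seed_key, cipher_otp)
```

Because encryption is a plain XOR with the keystream, applying the same function with the same key to the ciphertext restores the plaintext.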

C. Data Hiding Using QR-Code

To hide information, we need
a data container that suits the
purpose. The data container may be an image, a video, or a
Quick Response (QR) code [7]. The QR code was developed
by the Japanese corporation Denso Wave in 1994 [10]. It is a
two-dimensional array. The QR code can hold a considerable
volume of information: 7,089 characters for numeric-only data,
4,296 characters for alphanumeric data, and 2,953
bytes of binary (8-bit) data [19]. The QR code includes an
encoding region and function patterns: the encoding region
is used to store the data, and the function patterns include
position detection patterns, separators for position detection
patterns, timing patterns, and alignment patterns [20].

To generate a QR code, a string of bits is needed. This
string includes the characters of the original message, as
well as some information bits that tell a QR decoder
what type of QR code it is. After generating the string
of bits, the Reed-Solomon technique is used to generate
error correction data [21]. The resulting data from the string of
bits and the error correction data are used to generate eight
different QR codes, each of which uses a different mask
pattern. A mask pattern changes which pixels are
black (0) or white (1), which makes sure that the QR
code does not contain patterns that might be difficult for
a QR decoder to read [21]. Finally, the QR code that
uses the best mask pattern is generated, as shown in Figure 1.




Figure 1. Structure of QR Code (labeled parts: finder pattern, separators, timing pattern, remainder bits, format information)



D. Multi-Channel Based Authentication (MCA)

Authentication is an important aspect of a secure system,
where a user proves his identity by revealing certain
secrets that he possesses [2]. Most authentication schemes
use a single channel to authenticate users.
These schemes have undoubtedly improved security but
have not eliminated the possibility of some kinds of
attacks, such as man-in-the-middle/browser (MITM/B),
real-time phishing/pharming (RTP/P), and malware.
Therefore, researchers have come up with other schemes
to overcome these drawbacks, such as multi-channel
authentication (MCA) (i.e., a web channel combined with a
mobile network channel) [13].

In theory, MCA offers superior security over single-channel
authentication schemes. That is, for an attacker to
compromise a user account, several independent channels
have to be compromised first before gaining full access
to the user account [13]. Also, MCA makes it very difficult
for non-targeted attacks to compromise users'
accounts, especially if the attacker is not geographically
close enough to the user to gain access to the designated
devices used by some channels.






http://sites.google.com/site/ijcsis/ 
ISSN 1947-5500 



(IJCSIS) International Journal of Computer Science and Information Security, 
Vol. 13, No. 6, June 2015 



III. The Proposed Multi-Channel User 
Authentication Protocol 

The major aim of the proposed protocol is to eliminate
the drawbacks of password guessing attacks. The proposed
protocol uses an OTP encrypted by the RC4-EA method and then
hides the cipher-OTP in a QR code. Also, it integrates web-based
applications and mobile devices for user authentication over
multiple channels. The proposed protocol involves two parties:
a server (S) and a remote user (U). Each authorized U
can request service from S with the granted access rights. In
addition, each U has an electronic mail address and holds a mobile
device. The protocol consists of four phases: initialization
phase, registration phase, login phase, and authentication
phase. The notations employed throughout this paper are
shown in Table I.



Table I
Notations

Notation                  Description
$U$                       Remote User
$U_{ID}$                  User Identity
$U_{PW}$                  User Password
$U_{IP}$                  User IP Address
$U_{WIP}$                 A White List of Allowed IP Addresses
$U_{Prox}$                User Using Proxy
$U_M$                     User Mobile
$U_E$                     User Electronic Mail
$S$                       The Server
$h(\cdot)$                One-Way Hash Function
$a$                       Secret Key Used in RC4-EA Method
$(E/D)_{RC4\text{-}EA}$   Encryption / Decryption Using RC4-EA Method
$(E/D)_{QR}(\cdot)$       Function that Encodes/Decodes Data into (QR) Code
$\|$                      Concatenation
$T$                       Time Stamp
$r_1, r_2$                Random Nonce Generated by the Server
$T_c, T_{end}$            Time Created, Ended of Random Nonce



A. Initialization Phase 

This phase uses Internet Protocol Authentication (IPAuth),
a protocol suite for securing Internet communications by
authenticating each IP packet of a communication session.
IPAuth takes place between two parties, a server and a
user. The steps of IPAuth are explained below:

1) Assume that U requests S to join the system.

2) S checks $U_{Prox}$:
   If U accesses the system using a proxy,
   then S blocks U's connection.

3) S gets $U_{IP}$.

4) S checks the white list of IP addresses:
   if ($U_{IP} \in U_{WIP}$)
       then U is authentic; open the connection
   else
       reject the connection and block U
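The steps above can be sketched as a single check. How the server detects a proxy is not described in the paper, so it is passed in here as a boolean; the function name, the example addresses (taken from documentation ranges), and the option of listing whole networks in the white list are assumptions.

```python
import ipaddress


def ip_auth(user_ip: str, uses_proxy: bool, whitelist) -> bool:
    """Return True if the connection may be opened, False if it is blocked."""
    if uses_proxy:
        # Step 2: block users that connect through a proxy.
        return False
    address = ipaddress.ip_address(user_ip)          # Step 3: obtain U_IP
    # Step 4: accept only if U_IP is covered by an entry in the white list.
    return any(address in ipaddress.ip_network(entry) for entry in whitelist)


# Hypothetical white list of single addresses and one network.
allowed = {"203.0.113.7", "198.51.100.0/24"}

direct_ok = ip_auth("203.0.113.7", False, allowed)
via_proxy = ip_auth("203.0.113.7", True, allowed)
unknown = ip_auth("192.0.2.1", False, allowed)
```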



B. Registration Phase 

In this phase, U registers with S in order to use a
service. U and S execute the following steps:

1) U chooses an identity $U_{ID}$, electronic mail $U_E$, mobile
number $U_M$, and password $U_{PW}$, then computes
$X_U = h(U_{ID} \| U_{PW})$, and sends $\{U_{ID}, U_E, U_M, X_U, T_1\}$
to S via a secure channel.

$$U \rightarrow S : \{U_{ID}, U_E, U_M, X_U, T_1\} \tag{1}$$

2) S examines the time stamp $T_1$. If it is invalid, S
rejects the request. Otherwise, S checks whether $U_{ID}$, $U_E$, and $U_M$
are available for use. If they are, S computes
$Y_U = h(X_U \| U_{IP})$. Finally, S stores the values $U_{ID}$, $U_E$,
$U_M$, and $Y_U$ in its database.

$$S \rightarrow DB : \{U_{ID}, U_E, U_M, Y_U\} \tag{2}$$

C. Login Phase 

The login phase consists of the following steps:

1) U enters his $U_{ID}$ and $U_{PW}$, computes
$X'_U = h(U_{ID} \| U_{PW})$, and sends $U_{ID}$, $X'_U$, $T_2$
to S.

$$U \rightarrow S : \{U_{ID}, X'_U, T_2\} \tag{3}$$

2) S examines the time stamp $T_2$. If it is invalid, S
rejects the request. Otherwise, S computes $Y'_U = h(X'_U \| U_{IP})$,
then checks whether $U_{ID}$ is valid and $Y'_U = Y_U$. If so,
the user is allowed to log in. Otherwise, S allows U a maximum of
3 attempts to provide his correct $U_{ID}$ and $U_{PW}$.
If U exceeds this threshold, S considers U an
attacker and blocks his account.
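The registration and login steps can be sketched as below. Several details are assumptions, since the paper does not specify them: SHA-256 stands in for the one-way hash h, the string "||" is used as the concatenation separator, a time stamp counts as valid when it is within 60 seconds of the server clock, and the class and method names are hypothetical.

```python
import hashlib


def h(*parts: str) -> str:
    """One-way hash over concatenated parts (SHA-256 as a stand-in)."""
    return hashlib.sha256("||".join(parts).encode()).hexdigest()


class Server:
    MAX_ATTEMPTS = 3

    def __init__(self, max_skew: float = 60.0):
        self.db = {}            # U_ID -> stored record
        self.failures = {}      # U_ID -> consecutive failed logins
        self.blocked = set()
        self.max_skew = max_skew

    def _fresh(self, stamp: float, now: float) -> bool:
        return abs(now - stamp) <= self.max_skew

    def register(self, uid, email, mobile, x_u, t1, user_ip, now) -> bool:
        if not self._fresh(t1, now) or uid in self.db:
            return False
        # Store Y_U = h(X_U || U_IP) rather than X_U itself.
        self.db[uid] = {"email": email, "mobile": mobile, "y_u": h(x_u, user_ip)}
        return True

    def login(self, uid, x_u_prime, t2, user_ip, now) -> bool:
        if uid in self.blocked or not self._fresh(t2, now):
            return False
        record = self.db.get(uid)
        if record is not None and h(x_u_prime, user_ip) == record["y_u"]:
            self.failures[uid] = 0
            return True
        self.failures[uid] = self.failures.get(uid, 0) + 1
        if self.failures[uid] >= self.MAX_ATTEMPTS:
            self.blocked.add(uid)       # treat repeated failures as an attack
        return False


server = Server()
x_u = h("alice", "correct-password")                   # computed on the user side
registered = server.register("alice", "alice@example.com", "0000000000",
                             x_u, 1000.0, "203.0.113.7", 1001.0)
logged_in = server.login("alice", x_u, 1010.0, "203.0.113.7", 1011.0)
```

Because $Y_U$ binds the password hash to $U_{IP}$, a login from a different address fails in this sketch even with the correct password.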

D. Authentication Phase 

After U has logged in successfully, S
authenticates U over multiple channels by generating a
One-Time QR (OTQR) and a One-Time Password (OTP). This
phase is divided into two processes:

Authentication by Email channel process:

1) S generates a random nonce $r_1$, computes
$K_U = E_{RC4\text{-}EA}(r_1)$, and then computes $M_U = E_{QR}(K_U)$.
Finally, S stores $M_U$, $T_c$, $T_{end}$, where $M_U$ is the OTQR.

$$S \rightarrow DB : \{M_U, T_c, T_{end}\} \tag{4}$$

2) S sends $M_U$, $T_3$ to U via the mail channel.

3) U examines the time stamp $T_3$. If it is valid, U sends
$M'_U$, $T_3$ to S.

4) S checks whether $T_c \le T_3 \le T_{end}$ and $M'_U = M_U$.
If so, the user is authentic. Otherwise, the user is not
authentic.

Authentication by Mobile channel process:

1) S generates a random nonce $r_2$, then computes
$F_U = h(r_2)$. Finally, S stores $F_U$, $T_c$, $T_{end}$, where $F_U$ is
the OTP.

$$S \rightarrow DB : \{F_U, T_c, T_{end}\} \tag{5}$$

2) S sends $r_2$, $T_4$ to U via the mobile channel, then discards $r_2$.

3) U examines the time stamp $T_4$. If it is valid, U enters
$r_2$, then computes $F'_U = h(r_2)$ and sends $F'_U$, $T_4$ to S.

4) S checks whether $T_c \le T_4 \le T_{end}$ and $F'_U = F_U$.
If so, the user is authentic. Otherwise, the user is not
authentic.

If both the OTQR and the OTP checks hold, then server S is convinced
that user U is valid. Otherwise, the request is rejected.
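The mobile-channel steps can be sketched as follows. SHA-256 again stands in for h, the nonce length is arbitrary, the 300-second lifetime mirrors the 5-minute expiry mentioned in the implementation section, and all function and field names are assumptions. The email channel follows the same pattern with the RC4-EA and QR encoding steps applied to $r_1$; it is omitted because QR encoding needs a third-party library.

```python
import hashlib
import secrets


def h(value: str) -> str:
    """One-way hash (SHA-256 as a stand-in for the paper's h)."""
    return hashlib.sha256(value.encode()).hexdigest()


def issue_otp(now: float, lifetime: float = 300.0):
    """Server side: create nonce r2, keep only F_U = h(r2) and its validity window."""
    r2 = secrets.token_hex(4)
    record = {"f_u": h(r2), "t_c": now, "t_end": now + lifetime}
    return r2, record          # r2 is sent over the mobile channel, then discarded


def verify_otp(record, f_u_prime: str, t4: float) -> bool:
    """Server side: check T_c <= T_4 <= T_end and F'_U == F_U."""
    in_window = record["t_c"] <= t4 <= record["t_end"]
    return in_window and secrets.compare_digest(f_u_prime, record["f_u"])


r2, record = issue_otp(now=1000.0)
accepted = verify_otp(record, h(r2), t4=1100.0)       # user hashes the received r2
too_late = verify_otp(record, h(r2), t4=1400.0)
wrong_code = verify_otp(record, h("guess"), t4=1100.0)
```

Storing only the hash means that a leaked database record does not directly reveal the nonce that the user has to type in.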

IV. Implementation and Security Analyses 

Instead of using the traditional smart card for remote user
authentication, the proposed user authentication protocol
adopts the RC4-EA encryption method to encrypt the
plain-OTP and then hides the cipher-OTP in a QR code.
The user's electronic mail and mobile device are
responsible for receiving the OTQR and the OTP over
multiple channels to achieve mutual authentication between
U and S.

The performance of the proposed authentication protocol
was tested using a server (32-core AMD Opteron 6376 processor
with 32 GB of RAM and 4 RAID 1s), a laptop (Intel i5, 1.80
GHz processor, 2 GB RAM), and a simple mobile phone.
The experiments were implemented using a PHP-MySQL
environment.

A. Implementation 

The proposed user authentication protocol is designed to be robust,
secure, reliable, and very hard for illegitimate users to crack.
Implementing the OTQR/OTP techniques can help in
mitigating a typical phishing attempt. Whenever a user wishes
to log in to the website, the first step is to check that U comes from
the white list of allowed IP addresses $U_{WIP}$. The second step
is to enter $U_{ID}$ and $U_{PW}$ for remote user authentication.
Once U is logged in, he gets the OTQR/OTP by email/SMS
at his registered electronic mail address and mobile number,
respectively. The server stores the OTQR/OTP and the
date created (DC). An OTQR/OTP with status value 1 is
valid, which signifies that it can still be used by U. The
moment U uses the generated OTQR/OTP, it
expires, its status value changes from 1 to 0, and the
date used (DU) of the OTQR/OTP is registered. However, if U does not
use the OTQR/OTP within a period of 5 minutes, it
expires and its status value changes from 1 to 2, as shown in
Tables II, III, and IV.
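The status lifecycle described above (1 = valid, 0 = used, 2 = expired without use after 5 minutes) can be sketched as a small state machine. The record layout and function names are assumptions; the paper's implementation is in PHP-MySQL and is not reproduced here.

```python
from datetime import datetime, timedelta

VALID, USED, EXPIRED_UNUSED = 1, 0, 2
LIFETIME = timedelta(minutes=5)


def new_token(value: str, created: datetime) -> dict:
    """Create a token record with its date created (DC) and status 1."""
    return {"value": value, "dc": created, "status": VALID, "du": None}


def use_token(token: dict, presented: str, now: datetime) -> bool:
    """Try to consume a token; update status and date used (DU) accordingly."""
    if token["status"] == VALID and now - token["dc"] > LIFETIME:
        token["status"] = EXPIRED_UNUSED        # 1 -> 2: timed out, never used
    if token["status"] != VALID or presented != token["value"]:
        return False
    token["status"] = USED                      # 1 -> 0: consumed
    token["du"] = now
    return True


fresh = new_token("nH8XxG62", datetime(2015, 5, 24, 17, 47, 43))
first_use = use_token(fresh, "nH8XxG62", datetime(2015, 5, 24, 17, 51, 15))
second_use = use_token(fresh, "nH8XxG62", datetime(2015, 5, 24, 17, 52, 0))

stale = new_token("B0Ej0PF6", datetime(2015, 5, 23, 18, 31, 38))
stale_use = use_token(stale, "B0Ej0PF6", datetime(2015, 5, 24, 9, 0, 0))
```

A token can therefore be accepted at most once, and only within its 5-minute window.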



Table II
User Login Table for One Time Password

U.N       Password   Email                Mobile No.
aqwers    895*/66!   aqwers@egywow.com    96895635810
twerffr   P**2334    twerffr@egywow.com   96890125612
yuhfrd    Ad2*!98    yuhfrd@egywow.com    96695254523



Table III
Login table to the main website with OTQR via email

U.N      OTQR         DC                    Status        DU
aqwers   (QR image)   2015-05-24 18:50:15   1 (Valid)     Ready to Use
aqwers   (QR image)   2015-05-24 17:47:43   0 (Expired)   2015-05-24 17:49:15
aqwers   (QR image)   2015-05-23 18:31:38   2 (Expired)   Not Used














Table IV
Login table to the main website with OTP via SMS

U.N      OTP        DC                    Status        DU
aqwers   F21P40Ui   2015-05-24 18:50:15   1 (Valid)     Ready to Use
aqwers   nH8XxG62   2015-05-24 17:47:43   0 (Expired)   2015-05-24 17:51:15
aqwers   B0Ej0PF6   2015-05-23 18:31:38   2 (Expired)   Not Used



B. Security Analyses 

The security of the proposed protocol is analyzed under
the types of attacks listed below:

1) Prevent Replay Attack: In this type of attack,
the intruder gathers the communication messages
exchanged between U and S, then tries to replay
the same messages acting as a legitimate user. In
the proposed authentication protocol, the random
nonce values $r_1$, $r_2$, and $a$, with time stamp T, are
generated for each session, and the parameters in all
the messages are related to them. Those values
are verified by S as in Eqs. (4) and (5). S checks
the time T at which the request is received. If the
time stamp is not within the valid time interval, the server
S rejects the intruder's attempt to access the
service. Therefore, the proposed protocol is secure
against replay attacks.

2) Prevent Man-in-the-middle Attack: In this type of 
attack, the malicious user listens to the communication 
channel between S and U. In the proposed authentication 
protocol, the intruder may intercept the web/mobile 
communication messages but will not be able to 
compute the OTQR and the OTP, because they are based on 
random nonce values that are chosen fresh for each 
new session. Hence, the protocol is secure against 
man-in-the-middle attacks. 

3) Prevent Denial of Service (DoS) Attack: In a DoS 
attack, the attacker floods S with a large number of 
illegitimate access requests. The aim is to consume the 
critical resources of S; by exhausting these 
resources, the attacker can prevent S from serving 
legitimate users. In the proposed authentication protocol, 
for every access request from any user U, S checks 
Uprox and Uip as explained in Section III-A. Thus, 
the proposed protocol does not suffer from DoS 
attacks. 

4) Prevent Website Manipulation: One form of website 
manipulation is SQL injection, a hacking technique 
that attempts to pass SQL commands through a web 
application so that they are executed by the back-end 
database. SQL injection is ineffective against the 
proposed authentication protocol because the protocol 
escapes user input with the 
"mysql_real_escape_string()" function. Thus, the 
proposed protocol is secure against SQL injection 
attacks. 

5) Prevent Phishing Attack via the Web: Phishing 
is a form of online identity theft that aims to steal 
sensitive information. In the proposed authentication 
protocol, even if the intruder knows Um and obtains 
Upw by replacing the actual web page with a similar 
one, it would be difficult to obtain the OTQR and OTP, 
because they are sent over multiple channels and must 
be used within a specified time stamp, as in 
equations 4 and 5. 

6) Prevent Keylogger Attack: Keyloggers are 
applications or devices that monitor the physical 
keystrokes on a user's computer. They collect the 
information for later retrieval or send it to 
a spyware server. Keyloggers are ineffective against 
the proposed authentication protocol because it uses 
a virtual keyboard, which prevents the keylogger from 
recording U's sensitive data. 
Thus, the proposed protocol is secure against 
keylogger attacks. 
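
For item 4 above, a parameterized query is a commonly recommended complement to input escaping. Note that `mysql_real_escape_string()` belongs to PHP's legacy mysql extension, which was deprecated in PHP 5.5.0 and removed in PHP 7.0.0. The sketch below is not the authors' code: it uses Python's standard `sqlite3` module as a stand-in database, and the `users` table, its columns and the `find_user` helper are hypothetical.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user with a bound parameter instead of string concatenation."""
    cur = conn.execute(
        "SELECT username, email FROM users WHERE username = ?",
        (username,),  # the driver passes this value as data, never as SQL text
    )
    return cur.fetchone()


# Hypothetical in-memory table used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("aqwers", "aqwers@egywow.com"))

legit = find_user(conn, "aqwers")
injected = find_user(conn, "aqwers' OR '1'='1")  # classic injection attempt
```

Because the user name is bound as a value, the injection attempt is compared literally against the `username` column and matches no row.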

V. Conclusions 

The major contribution of this paper is a multi-channel user 
authentication protocol that enhances the security of remote 
user login. The proposed protocol adopts a one-time password 
(OTP) that is encrypted using the RC4-EA encryption method 
and then hidden, as a cipher-OTP, in a QR code. Therefore, 
the data cannot easily be retrieved without adequate 
authorization. The paper also integrates a web-based 
application with mobile-based applications to make 
authentication more secure than general authentication methods. 
This integration forms a multi-channel authentication scheme, 
which is better than single-channel authentication. The 
proposed authentication protocol is also more convenient, 
because the burden of carrying a separate hardware token is 
removed. Moreover, this protocol helps to overcome many 
challenging attacks, such as replay, DoS, man-in-the-middle 
and other malware attacks. 

Abstract: Radio Frequency Identification (RFID) is a 
contactless automatic identification technology. It uses 
radio-frequency signals to identify target objects 
automatically and obtain the relevant data without direct 
manual intervention, and it can identify school students 
in a variety of learning environments. Because 
schools and vocational institutes run their training 
systems on data that are not matched in real time, they 
cannot meet the demands of future study needs. 
The Internet of Things (IoT) overcomes this traditional 
structural flaw, which has made it a major concern and 
research topic for universities, schools and the wider 
vocational training community. 

Keywords: RFID Innovation, Internet of Things, Future 
Application 

I. Introduction 

The Internet of Things (IoT) [1, 2] is defined as a network 
that uses RFID [3, 4], infrared sensors, laser 
scanners, global positioning systems and other 
information-sensing devices to connect any object to the 
Internet according to an agreed protocol, so that 
information can be exchanged and communicated in order 
to achieve intelligent identification, location, tracking, 
monitoring and management. The IoT concept was put 
forward in 1999 [5, 6]. The IoT is "the Internet of 
connected physical objects". This has two meanings: 
first, the core and foundation of the IoT is still the 
Internet, of which it is an extension and expansion; 
second, its client end extends to any object, so that 
objects can exchange information and communicate. 

RFID is a technology with significant value and great 
potential for education. RFID promises to replace the 
older barcode and adds real-time visibility, regardless 
of the location within the school network. RFID 
applications are found in many fields, but its 
main use here is in tracking students' RFID tags 
(assets). This can be used by universities, schools and 
vocational training institutes to follow the examination 
process with the IoT. The objectives of this paper are: 

• To promote the objectives of the IoT across the 
whole education community. 

• To highlight opportunities for research and 
innovation in educational or vocational 
training. 

• To identify the current state of the technology and 
future requirements for schools. 

• To introduce future applications that bring the 
school community into a new era of RFID using the IoT. 



II. Literature Review



In its simplest form, RFID [7, 8] is a concept similar to 
barcode technology, but it does not require a direct 
line of sight to the monitored items, as presented in 
Figure 1. Just as barcode systems require a suitable 
optical reader and special labels attached to the items, 
RFID needs reader equipment and special tags or 
cards attached to the students' items in order for 
them to be tracked. 

Figure 1: Conceptual diagram of RFID Reader (labelled components: Antenna, Computer) 

RFID has a long history as part of the technological 
revolution, both present and past [9, 10]. 
RFID enables quick payment of tolls and quick 
identification of items. RFID also provides benefits such as 
tracking assets, monitoring conditions for safety, and 
helping to prevent counterfeiting. Together with the 
Internet and the mobile phones that connect the world, 
RFID plays a vital part in the technological revolution. 
An RFID system comprises three basic components. The 
first is the RFID tag, which is attached to an asset or 
item. The tag contains information about that asset or 
item and may also incorporate sensors and other 
devices [11]. 




Figure 2: RFID Tag, RFID Reader or Interrogator and 
Computer connected with Internet. 

The second component is the RFID interrogator (reader), which 
communicates with (also called interrogating) the RFID tags [12]. 
The last component is the backend software system, which 
interfaces the RFID interrogators with a centralized school 
database [13]. This centralized database contains additional 
information, such as cost, for each RFID-tagged item. 

RFID technologies can be grouped into three classes: 
passive RFID, active RFID and semi-passive RFID 
[14, 15]. Based on the radio frequency used, passive 
RFID technologies are usually classified into 
low-frequency (LF) RFID, high-frequency (HF) RFID, 
ultra-high-frequency (UHF) RFID and microwave RFID [16, 17]. 

RFID technology is now widespread and can be found in 
many applications. Some of the RFID devices in use are 
RFID scanners, RFID printers, RFID antennas and RFID 
readers. Radio frequency identification, also called RFID, 
describes a system that transmits the identity of an object 
or person wirelessly, using radio waves, in the form of a 
unique serial number [18]. 

An RFID system can comprise several components: tags 
(transponders), tag readers, antennas and an 
interface [19]. In a typical RFID system, individual 
objects are equipped with a small, inexpensive tag. 
The tag contains a transponder with a digital memory chip 
that is given a unique electronic product code. 
The interrogator, an antenna packaged with a 
transceiver and decoder, emits a signal that activates the 
RFID tag so that it can read data from and write data to it. 
When an RFID tag passes through the electromagnetic 
zone, it detects the reader's activation signal [20]. 
The reader decodes the data encoded in the 
tag's integrated circuit and passes the data 
to the host computer. The application software on the host 
computer processes the data and applies various filtering 
operations to reduce the numerous, often redundant, reads 
of the same tag to a smaller and more useful data set [21]. 
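
The filtering of redundant reads mentioned above can be illustrated with a short sketch. The source does not say how the filtering is done, so the time-window rule, the 2-second default and the function name `filter_duplicate_reads` are assumptions made purely for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Read = Tuple[float, str]  # (timestamp in seconds, tag ID)


def filter_duplicate_reads(reads: Iterable[Read], window: float = 2.0) -> List[Read]:
    """Keep a read only if the same tag was not kept within `window` seconds."""
    last_kept: Dict[str, float] = {}
    kept: List[Read] = []
    for ts, tag in sorted(reads):  # process reads in time order
        previous = last_kept.get(tag)
        if previous is None or ts - previous >= window:
            kept.append((ts, tag))
            last_kept[tag] = ts
    return kept


raw_reads = [(0.0, "A"), (0.5, "A"), (1.0, "B"), (2.5, "A"), (2.6, "B")]
clean_reads = filter_duplicate_reads(raw_reads)
```

A tag that stays in the reader's field is then reported once per window instead of once per radio poll, which is the kind of smaller data set the paragraph refers to.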

III. The application of RFID in IoT 

Although RFID technologies have been around for 
almost three decades, it is only recently that the technology 
has gained significant momentum, owing to the convergence of 
lower cost and increased capabilities of RFID tags 
[22]. At present, RFID is emerging as an important technology 
for transforming a wide range of applications, including supply 
chain management, retail, aircraft maintenance, 
anti-counterfeiting, baggage handling and health care [23]. It 
also heralds the emergence of inexpensive and highly 
effective pervasive computers that will have dramatic impacts 
on individuals, organizations and societies. Many organizations 
are planning to exploit, or have already exploited, RFID in their 
main operations to take advantage of the potential for more 
automation, efficient learning processes and examination 
visibility [24]. For instance, recent news indicates that 
top retail corporations have reduced out-of-stocks by 30 percent 
on average after launching their RFID programs. 
Many forecasts agree that RFID will create a new generation of 
businesses worth billions. 

In future IoT applications, institutions could manage every 
item in real time and manage their school architecture. 
They would not only manage circulation in the supply chain and 
share information, but also analyze the information generated by 
each process and make forecasts. By forecasting from the 
information in the current network of student RFID tags, future 
trends and the likelihood of an accident can be estimated, 
remedial measures can be adopted and warnings issued in 
advance. This could improve the ability of institutions to 
respond to school business [25]. 

The IoT could affect the entire school network. First, it could 
streamline content network management; second, it could 
ensure that resources are used effectively; third, it could 
make the entire content network visible, improving 
information transparency; fourth, the network could be 
managed in real time; finally, it could give the 
content network high agility and complete integration 
for the study environment [26]. 

Figure 3: Internet of Things schematic showing the end users and application areas based on data. 



The IoT affects supply chain management in the 
manufacturing, warehousing, transportation and sales 
links [27]. This allows schools to respond quickly to 
changes in academic demand across almost the whole learning 
process, and improves the adaptability of the school 
network to such changes. 

IV. Methodology 

Although RFID was once regarded as a forward-looking 
technology, its adoption across a variety of industries 
has made it far more commonplace. As RFID becomes more 
prevalent across a variety of industries, school 
organizations looking to gain a competitive edge are now 
using the technology in a variety of creative ways that 
schools have not seen before [28]. 
The question that many schools are now asking is: 
where is RFID heading? The answer appears to be that the 
technology has a bright future, with additional value-added 
features appearing at comparable cost. 

Currently, RFID is changing utility operations through 
smart devices that collect and communicate the amount of 
power consumed in a household. Smart meters [28] 
are an example of a technology that is fundamentally 
changing market practices by recording electricity 
consumption at regular intervals and communicating it back 
to the utility for monitoring and billing. 

Figure 4: Future application scenario for school using RFID 
on IoT 

Figure 5: Framework of RFID application on IoT 

Figure 6: Functionality of RFID application on IoT 

When students receive a product with an RFID 
anti-counterfeiting label, they use the label code to access 
the school network's anti-counterfeiting information service 
address through RFID-enabled mobile phones or 
Internet-connected PCs equipped with RFID read-write 
devices, and then request the service to obtain 
product-related information and verify the authenticity of the 
product. Taking an RFID-enabled mobile phone as an example, 
the specific anti-counterfeiting steps when the 
buyer wants to verify the authenticity of the product are as follows. 

First, the student obtains the school community's 
anti-counterfeiting server address directly from the product 
description or other identification. On 
receiving the RFID-tagged product, the student 
uses the RFID-enabled phone to log on to the school Web 
site and establish a network connection with the school's main 
server, following the prompts. After connecting to 
the school server, the phone enters 
the interactive process. 



Figure: Process flow of RFID application 

Second, the buyer uses the phone to read the product 
tag to obtain the product's RFID code and submits 
it to the school's anti-counterfeiting server. If the 
code matches the school's RFID coding standard, 
the main server invokes its security 
cryptographic algorithm to generate random data 
and sends it to the student's phone; at the same time, the 
school server computes a result from the random data according 
to the security cryptographic algorithm and stores the result. 

Third, the student's phone passes the random data 
to the tag while it is connected to the school 
server. After the tag's internal computation, the result 
is displayed by the phone and finally sent 
by the phone to the school server for verification. 

Finally, the school server compares the result received 
from the student with the result it stored 
earlier and, if they are consistent, sends the message "the 
study content is genuine" back to the 
phone. 
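
The four steps above amount to a challenge-response check between the server and the tag. The sketch below illustrates that idea only. The source does not name the "security cryptographic algorithm", so HMAC-SHA256 with a per-tag shared secret is an assumption, and the names `AntiCounterfeitServer`, `tag_response` and the code `TAG-001` are hypothetical.

```python
import hashlib
import hmac
import secrets


def tag_response(tag_key: bytes, challenge: bytes) -> bytes:
    """What the tag computes internally from the server's random data."""
    return hmac.new(tag_key, challenge, hashlib.sha256).digest()


class AntiCounterfeitServer:
    def __init__(self, keys_by_code: dict):
        self.keys_by_code = keys_by_code  # RFID code -> secret shared with the tag
        self.pending = {}                 # RFID code -> outstanding challenge

    def issue_challenge(self, rfid_code: str) -> bytes:
        """Step 2: accept a known RFID code and return fresh random data."""
        if rfid_code not in self.keys_by_code:
            raise KeyError("unknown RFID code")
        challenge = secrets.token_bytes(16)
        self.pending[rfid_code] = challenge
        return challenge

    def verify(self, rfid_code: str, response: bytes) -> bool:
        """Step 4: compare the tag's result with the server's own result."""
        challenge = self.pending.pop(rfid_code, None)  # each challenge is single-use
        if challenge is None:
            return False
        expected = tag_response(self.keys_by_code[rfid_code], challenge)
        return hmac.compare_digest(expected, response)


key = secrets.token_bytes(32)
server = AntiCounterfeitServer({"TAG-001": key})
challenge = server.issue_challenge("TAG-001")
genuine = server.verify("TAG-001", tag_response(key, challenge))
```

In this sketch the server recomputes the expected result at verification time rather than storing it in advance as the text describes; the comparison is the same, and consuming the challenge on first use keeps a recorded response from being replayed.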

V. Conclusion 

RFID is a promising automatic identification method, 
considered by many to be one of the most 
pervasive computing technologies ever. RFID relies on 
remotely storing and retrieving data using 
devices called RFID transponders or RFID tags. An 
automatic identification technology such as an 
Auto-ID system based on RFID is an important 
asset for learning systems for two reasons. First, 
the visibility provided by this technology 
allows accurate knowledge of the study level by eliminating the 
discrepancy between exam record and physical health. Second, 
RFID technology can prevent or reduce sources of 
error. The benefits of using RFID technology include 
reduced labour costs, simplified 
school processes and fewer learning errors. 

Recently, with the wide spread of the next generation of the IoT, 
RFID technology and the IoT are being combined. Using the 
application for monitoring and anti-counterfeiting on a 
network of student RFID tags makes fuller use of the 
advantages of RFID technology. These efforts could 
achieve truly visualized management with RFID. 
Hence, institutions in our country should actively promote 
the development of RFID technology with the IoT to 
support the management of schools and 
vocational training. 

Abstract. Human misbehavior causes breaches of security systems. One of 
the reasons behind this is the neglect of human acceptance. For that 
reason, new technologies are usually faced with rejection or acceptance 
issues. The Technology Acceptance Model (TAM) is one of the well-known 
models used to predict the acceptance of new technologies. Biometrics 
as an authentication approach is still under development. Relying on 
biometrics for authentication has some important characteristics; mainly, 
it is faster and easier because users do not have to deal 
with unfamiliar interfaces, such as typing a password, signing or even 
deliberately exposing some part of the body. This study investigates the 
users' intention to use biometrics as an authentication tool among 
young Arab people. A survey involving 74 individuals was conducted. 
The results reveal that perceived ease of use and perceived usefulness 
are significant drivers of the intention to use biometrics as 
an authentication tool. In addition, the results show that perceived 
usefulness is the most crucial factor in deciding whether or not to 
adopt new technologies. 

Keywords: Intention to Use, Biometrics Technology, Authentication. 



1 Introduction 

In this digital world, we have become computer slaves (Lao, 2005). While this 
makes life much easier, compromised security arises as an issue of high concern 
(Sukhai, 1998). Information overload continues to increase due to the 
expansion of applications that require authentication. For individuals, it is often 
difficult to remember the user names and PINs they rely on to authenticate 
access to their confidential data. Thus, many users select passwords that are 
relatively easy to remember (Coventry, 2003); this is regarded as a security 
trade-off. Therefore, information security is in serious need of more 
advanced techniques that ultimately aim at improving its performance. 
Biometrics, as an option, brings good solutions for most authentication problems 
(Bala, 2008) and (Rashed, 2010a). There are three types of authentication 
according to (Boatwright, 2007), (Coventry, 2003) and (Jones, 2007): 

1. Information related to something an individual knows; for example a PIN or 
a password. 

2. Information related to something an individual has (i.e., possesses); for 
example a passport, a smart card, a key or a cell-phone (Herzberg, 2003). 

3. Information related to something that uniquely identifies an individual (i.e., 
Biometrics); for example, fingerprints, signature, ear shape, odour, key- 
stroke, voice, finger geometry, iris, retina, DNA, and hand geometry (Gleni, 
2004) and (Prashanth, 2009). 

Using a PIN, also referred to as a password, is the most widespread technique 
(Skaff, 2007). In spite of its ease of use, relying on PINs has a critical observed 
vulnerability. This vulnerability results from the difficulty individuals 
have in memorizing several passwords/PINs. In addition, 
user practices are very difficult to police (Rose, 1998). Therefore, 
relying on biometrics emerges as the best solution or practice for authentication. 
On the one hand, users can uniquely authenticate themselves without being 
asked for PINs. On the other hand, users are not required to remember any 
piece of information in the authentication process (Coventry, 2003). This in 
turn makes users more comfortable (Sukhai, 1998). 

Biometrics may fit appropriately as an authentication 
tool in all sensitive organizations (Rashed, 2010b). However, user 
acceptance is a concern when it comes to adopting biometrics for 
authentication. Customer acceptance is highly critical, as new technologies are prone to 
rejection in unexpected ways. For example, the first mechanical cash issuer 
was removed after six from its initial installation because it fell short of 
customer acceptance (Rashed, 2010c). As acceptance of technology is a 
milestone (Szajna, 1996), this study investigates and examines the intention to use 
biometrics as an authentication tool among young Arab people. 

The rest of the paper is organized as follows. In section 2 we overview the 
previous studies as literature review and address the problem statement. In 
section 3 we demonstrate our methodology and discussion. We conclude and 
present future work in section 4. 






http://sites.google.com/site/ijcsis/ 
ISSN 1947-5500 



(IJCSIS) International Journal of Computer Science and Information Security, 
Vol. 13, No. 6, June 2015 



2 Literature 

Many researchers have validated TAM using different tools with regard to a 
variety of cultures. Chen et al. (2009) studied the determinants of consumer 
acceptance of virtual stores. Their results indicated that their proposed theo- 
retical model was sufficiently able to explain and predict consumer acceptance 
of virtual stores substantially. They presented a theoretical model that could 
explain a large portion of the factors that lead to a user's behavioral intention 
to use and actual use of a virtual store. Their model also could supply virtual 
stores with a number of operative critical success factors to remain competi- 
tive in the volatile electronic marketplace. 

Kripanont reviewed the literature concerning prominent theories and 
models of authentication and Information Technology (IT) acceptance. His 
thesis focused on internet usage behavior and behavior intention. TAM was 
expected to explain and predict user behavior, to help practitioners 
analyze the reasons for resistance to technology, and to help them take 
efficient measures to improve user acceptance and usage of the technology. 

Twati studied the cultural norms and beliefs within multi-national organi- 
zations in two regions. The first region covered Arab countries in North Africa 
(i.e., Libya). The second region covered Arab countries in the Persian Gulf (i.e., 
Kuwait, Oman, Saudi Arabia, and United Arab Emirates). The results revealed 
that the two regions were not homogeneous. In addition, the study conveyed 
that age, gender, and education levels are factors contributing to the success 
of Management Information Systems (MIS) adoption in both regions. Fur- 
thermore, the study showed differences in organizational cultures that have 
impacts upon MIS adoption in both regions. The Persian Gulf region was 
dominated by an adhocracy culture that values the adoption of MIS, whereas 
the North Africa region was dominated by the hierarchy culture type that fa- 
vors a centralized management style, which negatively impacts MIS adoption. 
The Persian Gulf region did not show any significant effect of technology ac- 
ceptance variables. However, in the North Africa region, technology ac- 
ceptance played a vital role in MIS adoption. 

Rose and Straub studied technology acceptance in five Arab cultures: 
three Asian countries including Jordan, Saudi Arabia, and Lebanon; and two 
African countries including Egypt and Sudan. They examined the ease of use 
and perceptions of usefulness. Furthermore, they studied the role of the two 
factors in influencing actual usage and perceptions of usefulness to mediate 
the effect of perceptions of ease of use on actual usage. Their findings were 
consistent with the majority of TAM findings in the US. 

Ramayah et al. examined the intention to use an online bill payment system 
among part-time MBA students at Universiti Sains Malaysia, Penang. They 
developed and later modified the extended Technology Acceptance Model 
and Social Cognitive Theory to identify the factors that would determine and 
influence the intention to use an online bill payment system. They found that 
perceived ease of use and perceived usefulness are the significant drivers of 
intention to use the online bill payment system. They also found that subjec- 
tive norm, image and perceived ease of use were the key determinants of per- 
ceived usefulness whereas perceived risk was found to be negatively related 
to usefulness. Moreover, they found that computer self-efficacy played a sig- 
nificant role in influencing the perceived ease of use of the online bill payment 
system. 

Coventry et al. (2003) addressed consumer-driven usability and user 
acceptance of biometrics. They focused on finding out how iris verification can be 
used with Automatic Teller Machine (ATM) user interfaces. Their findings showed that 
90% of their study participants were satisfied with the iris verification method and 
would select it over signatures or PINs. 

Rashed et al. (2010c and 2010d) examined the feasibility and 
future of odour authentication. They presented odour as a user authentication 
interface and discussed its usage, advantages, disadvantages and user 
acceptance. They applied and tested TAM on the Arab culture, and their 
findings were consistent with previous studies (Ramayah, 2005). They 
concluded that odour may be used in an odour ATM (OTM), and they studied this in two 
different cultures. 

Rashed et al. (2010a) studied the importance of applying biometrics in the 
financial sector to overcome user problems (e.g., recalling PINs and carrying 
cards) and to ensure information security. Their idea depends on using 
biometrics as an interface in ATMs. They presented their idea together with its 
challenges and suggested replacing ATMs with OTMs. They concluded that the 
idea is capable of gaining user acceptance and called for more research in this field. 

Using biometrics as an authentication tool may not be expected by users. 
Biometric technologies create the challenge of avoiding attacks before 
they take place (Rashed, 2010b). We think that the problem resides in how we 
could present biometrics in a form that overcomes the worries about users' 
expectations. 

3 Methodology and Discussion 

Seventy-four printed questionnaires were collected from the study 
participants, so our sample consisted of seventy-four respondents. The main findings 
can be summarized as follows: 

Table 1. Sample Profile 

Variable          Category           Frequency   Proportion 
Age               21-30              50          0.68 
                  31-40              18          0.24 
                  41-50              4           0.05 
                  51-60              2           0.03 
Specialization    IT                 22          0.30 
                  Social Sc.         8           0.11 
                  Engineering        16          0.22 
                  Others             28          0.38 
                  Student            1           0.01 
Education level   Secondary School   8           0.11 
                  BSc                49          0.66 
                  College            10          0.14 
                  MSc                5           0.07 
                  Ph.D.              1           0.01 

• Table 1 shows that the majority of the questionnaire respond- 
ents were within the age interval [21-30] which represents 
young people with 68%. From the same table we can see that 
30% of the respondents were IT specialists. Moreover, the ta- 
ble shows that most of the respondents were B.Sc. holders. 

• 19% of the questionnaire respondents did not like the idea of 
using biometrics in authentication, whereas 47% of them liked 
the idea, 23% did not decide and 0.01%. 

• 69% found that biometrics as an authentication system would 
improve their efficiency and effectiveness in life, 
and 77% found that it would enhance their productivity in life. 

• The majority, representing 58% of the questionnaire respond- 
ents found it easy to use biometrics as authentication system. 

• The majority, representing 54% of the questionnaire 
respondents, indicated that they would frequently use this type of 
authentication technique if it were available. Most of the 
respondents who intended to use this technique were young 
people. Table 2 shows that there is a strong relationship 
between age and intention to use. Young people are strongly 
inclined to accept a new biometric interface. 

• 38% of the respondents showed willingness to use biometrics 
as an authentication system, whereas 14% confirmed that they 
would not use it even if it were available. The largest group, 
representing 49%, did not decide (i.e., they were not certain). 



Table 2. Statistical Analysis of the Study 

             DF   SS            MS       F        Significance F 
Regression   1    800.3333333   800.33   21.438   0.043617929 
Residual     2    74.66666667   37.333 
Total        3    875 

               Coefficients   Standard Error   t Stat    P-value   Lower 95%     Upper 95%     Lower 95.0%   Upper 95.0% 
Intercept      55.46666667    4.938735781      11.231    0.0078    34.21700168   76.71633165   34.2          76.71633 
X Variable 1   -6.533333333   1.411067366      -4.6301   0.0436    -12.6046662   -0.46200048   -12.6         -0.462 
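
The entries of Table 2 can be cross-checked from the standard relations MS = SS / DF, F = MS(regression) / MS(residual) and t = coefficient / standard error. The short script below is only a consistency check on the reported numbers, not part of the study. It assumes that the slope is negative, which is what the reported t statistic (-4.6301) and the reported bounds (-12.6 and -0.462) imply.

```python
# Values copied from Table 2; the negative slope is an assumption (see lead-in).
ss_regression, df_regression = 800.3333333, 1
ss_residual, df_residual = 74.66666667, 2

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual      # compare with F in the table
ss_total = ss_regression + ss_residual         # compare with SS in the Total row

slope, slope_se = -6.533333333, 1.411067366
intercept, intercept_se = 55.46666667, 4.938735781
t_slope = slope / slope_se                     # compare with t Stat, X Variable 1
t_intercept = intercept / intercept_se         # compare with t Stat, Intercept

print(f"F = {f_statistic:.3f}, t(slope) = {t_slope:.4f}, "
      f"t(intercept) = {t_intercept:.3f}, SS total = {ss_total:.1f}")
```

With only 2 residual degrees of freedom, the fit rests on four data points, so the significance level should be read with that sample size in mind.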



Figure 1, which illustrates our proposed model, shows that both perceived 
usefulness and perceived ease of use are significant drivers of the intention to 
use new technologies. Thus, both perceived effectiveness and performance, the 
pillars of perceived usefulness, play an important role in the intention to use new 
technologies. 

Fig. 1. Applying TAM on Biometrics 



4 Conclusions 

We distributed a bilingual questionnaire to study the potential for accepting 
biometrics as an authentication tool. Respondents to this study found it a 
good idea and indicated an intention to use it in the future if it became 
available. Our findings confirm the previous results. The results 
reveal that perceived ease of use and perceived usefulness are significant 
drivers of the intention to use biometrics as an authentication 
tool. In addition, the results of this study revealed perceived usefulness to be the 
most crucial factor in the decision to adopt new technologies. According to 
this study, security remained a significant factor affecting the behavior of 
users. Moreover, we found a tight relationship between acceptance and age; 
young people showed more appetite to accept a new biometric interface. 

Presenting the underlying concept in an acceptable form would accelerate 
the acceptance and adoption of this tool. This would also raise users' 
concerns about the security level of the approach. Many users thought that 
hacking this approach would be easy, and thus it needs to be strengthened 
by a supplemental approach to enhance overall performance. In addition, 
biometric data can be stored on a smart card that has a microprocessor and 
a micro biometric sensor. The micro-sensor can obtain the data from the 
user and communicate directly with machines to authenticate the card 
holder. The card should have storage space for the biometric data (i.e., an 
encrypted, digitized format stored on the card). 
Abstract — The Vikram Samwat Gujarati calendar is a well-known, 
ancient calendar used by Gujaratis in India. It follows the 
period of the successive return of the moon to conjunction with, 
or opposition to, the sun in relation to the earth. Data mining 
retrieves knowledge from data without any prior hypothesis. This 
research applies computer intelligence to analyze the association 
of one weather parameter, temperature, with this calendar using 
temporal association rule mining. The experimental results show 
that special associations exist between weather parameters and 
this calendar. These associations can provide new insight to 
researchers in this area, and the approach does not require any 
extra expertise in weather. 

Keywords- Temporal association rule mining; weather 
prediction; Gujarati tithi 

I. Introduction 

The Gujarati Hindu calendar is an ancient calendar prepared 
mainly by considering the positions of the sun, moon and earth. 
These relative positions are the main source of day, night and 
the seasons on the earth [1]. To predict temperature, weather 
forecasters use the history of the weather parameters, the 
current status of various parameters received from satellites or 
instruments, and simulations of different complex models. 
Moreover, whatever result is generated, they have to apply their 
expertise to provide the final prediction [2], [3], [4]. 

Data mining does not require any prior knowledge and provides 
techniques to discover interesting patterns from large amounts of 
data in databases, data warehouses, or other information 
repositories. It is an interdisciplinary field that draws on 
areas such as statistics, machine learning, data visualization, 
information retrieval, high-performance computing, neural 
networks, pattern recognition, spatial data analysis, image 
databases and signal processing, and on many application fields, 
such as business, economics and bioinformatics [5], [6], [7]. It 
is now also used in weather forecasting with temporal data. 
Temporal association rule mining is the area of data mining that 
discovers associations from time-stamped data. Association rule 
mining is nowadays also used for prediction [8]. 



Given the importance of the Gujarati Hindu calendar mentioned 
above, this research contributes an analysis based on temporal 
association rule mining with "tithis" to derive novel 
associations between weather parameters and the "tithis". 
Section II describes the Gujarati Hindu calendar, and Section III 
reviews association rule mining as used in environmental 
forecasting. Section IV proposes the framework to discover the 
association between temperature and "tithis". Section V discusses 
the results, followed by the conclusion and future scope. 

II. Gujarati Calendar Tithi 

In India, the Gujarati Hindu (Vikram Samwat) calendar is the most 
ancient calendar and is part of Gujaratis' lives for identifying 
auspicious days and holy schedules. Alongside it, the English 
calendar, known as the Gregorian calendar, is followed for 
dealing with the wider world [1]. The Gujarati Hindu calendar 
follows the period of the successive return of the moon to 
conjunction with, or opposition to, the sun in relation to the 
earth. This is the period from new moon to new moon, or full moon 
to full moon, measured as the lunar month. So, in this calendar, 
months follow the moon and days follow both the sun and the moon. 
Lunar days, or "tithis", can vary in length. Sometimes a "tithi" 
is absent, and sometimes two consecutive days share the same 
"tithi". This is because in the Gujarati calendar the days are 
calculated from the difference in longitudinal angle between the 
positions of the sun and the moon. 

The Gujarati Hindu calendar follows the lunar year, which 
consists of 12 months. A lunar month contains two fortnights and 
begins with the new moon, called "amavasya". Each lunar month has 
30 tithis of 20 to 27 hours each. During the waxing (bright) 
phase of the moon, the tithi is identified as "Shukla" (or 
"Sud"); this fortnight ends with the full moon night called 
"purnima" and is also known as the auspicious fortnight. During 
the waning phase of the moon, the tithi is identified as 
"Krishna" or "Vad" (the dark phase), which is also known as the 
inauspicious fortnight [1]. In India there are, in general, three 
seasons: winter, summer and monsoon. According to this calendar, 
the seasons follow the sun's position. 

The relative positions of the sun, moon and earth motivate us to 
analyze their relation to the weather. The next section 
illustrates the use of data mining techniques, mainly association 
rule mining, for weather and environmental data. 

III. Association Rule Mining and its Environment 
Application 

Association rule mining is an important and fundamental data 
mining task. Its objective is to find all co-occurrence 
relationships, called associations, among data. It has attracted 
a great deal of attention and has been extensively studied by the 
database and data mining community [5]. Many efficient 
algorithms, extensions and applications have been reported. 

Most data analysis methods are based on classification or 
clustering algorithms, which respectively assign the data to 
specific groups or establish groups of correlated data. These 
algorithms are quite successful, but they have some limitations: 
a data record has to be placed in one and only one group, and no 
relationship or association can be inferred between the different 
members of a group. 

Association rule mining overcomes such problems. It is an 
unsupervised data mining technique that discovers descriptive 
rules from very large datasets. The technique has several merits. 
Any data item can be assigned to any number of rules, without 
limitation, as long as its expression fulfills the assignment 
criteria. Rules are oriented (If ... then ...) and thus, to a 
certain extent, describe the direction of a relationship. Last 
but not least, by focusing on strong rules, the decision maker 
does not have to browse and study a huge number of redundant 
rules. 

The strength of a rule is measured against two thresholds, 
support and confidence [6], [7]. The support of a rule 
$X \rightarrow Y$ is the percentage of transactions in $T$ that 
contain $X \cup Y$, and can be seen as an estimate of the 
probability $\Pr(X \cup Y)$. Let $n$ be the number of 
transactions in $T$; then the support of the rule is computed as 
follows: 

$$\text{Support} = \frac{(X \cup Y).\text{count}}{n} \qquad (1)$$ 

The confidence of a rule $X \rightarrow Y$ is the percentage of 
transactions in $T$ containing $X$ that also contain $Y$. It can 
be seen as an estimate of the conditional probability 
$\Pr(Y \mid X)$. It is computed as follows: 

$$\text{Confidence} = \frac{(X \cup Y).\text{count}}{X.\text{count}} \qquad (2)$$ 

Given a transaction set T, the objective of association rule 
mining is to discover all association rules in T whose support 
and confidence are greater than or equal to the user-specified 
minimum support and minimum confidence. 
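
As an illustration of Eqs. (1) and (2), the following minimal Python sketch counts support and confidence over a small transaction set. The function names and the toy transactions are assumptions made for this example; they are not data or code from the study.

```python
def support(transactions, itemset):
    """Eq. (1): fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Eq. (2): (X u Y).count / X.count, an estimate of Pr(Y | X)."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    x_count = sum(1 for t in transactions if x <= t)
    xy_count = sum(1 for t in transactions if xy <= t)
    return xy_count / x_count if x_count else 0.0


# Hypothetical toy transactions (illustrative only).
T = [frozenset(t) for t in (
    {"hot", "Vad3"},
    {"hot", "Vad3", "humid"},
    {"mild", "Sud3"},
    {"hot", "Sud3"},
)]

s = support(T, {"hot", "Vad3"})        # 2 of the 4 transactions
c = confidence(T, {"Vad3"}, {"hot"})   # both transactions with Vad3 are hot
```

A rule is kept only when both values reach the user-specified minimum support and minimum confidence.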

Association rule mining has a very wide range of applications 
and is used in areas such as marketing and sales [5], documents 
and text, bioinformatics [6] and web servers [7]. 

Weather forecasters predict the weather mainly from simulations 
of numerical and statistical models, which require intensive 
computation, complex differential equations and computational 
algorithms, and whose accuracy is bound by constraints such as 
the adoption of incomplete boundary conditions, model assumptions 
and numerical instabilities [9]. 

The variety of environmental applications and their data, which 
are multi-disciplinary, multi-sensor, multi-spectral, 
multi-resolution, spatio-temporal and high-dimensional, provide a 
rich platform for the practice of data mining [2]. Data mining is 
also helpful to decision makers and non-computational users of 
environmental data, as the following studies show: 

• The authors presented a method for predicting daily rainfall 
from meteorological data for 1961-2010, using the atmospheric 
parameters temperature, dew point, wind speed, visibility and 
precipitation (rainfall). They applied the basic Apriori 
algorithm of association rule mining to predict the rainfall 
[10]. 

• Association rule mining was used to identify rules that 
indicate the relations between atmospheric parameters such as 
day, time, year, temperature, pressure and humidity, and air 
pollutant data such as date, time, CH4, CO and CO2 [11]. 

• The authors derived the close relationship between 
environmental factors and ecological events, namely the red 
tide phenomena that occurred during 1991 and 1992 in Dapeng 
Bay, South China Sea, using temporal association rules and 
K-means clustering analysis on time, sea water temperature, 
salinity, dissolved oxygen, pH, etc. [12]. 

• The relationships between the trajectories of Mesoscale 
Convective Systems (MCS), i.e. thunderstorms, and the 
environmental physical field values were analyzed using 
spatial association rule mining to predict heavy rainfall 
[13]. 

• To discover the weather for a specified region, weather 
patterns of similar regions in British Columbia were analyzed 
using association rules with data such as temperature, 
precipitation and wind velocity [14]. 

• Owing to the increasing number of earthquakes, tornadoes and 
tsunami waves, incremental mining of association rules was used 
to discover shocking patterns at the current time with respect 
to previously discovered patterns, rather than exhaustively 
rediscovering all earthquake patterns [15]. 

• The authors analyzed historic salinity-temperature data to 
predict future variations in the ocean salinity and temperature 
relations in the waters surrounding Taiwan, using 
inter-dimensional association rule mining with fuzzy inference 
over spatial-temporal relationships, where traditional 
statistical models fail to relate spatial and temporal 
information [16]. 

• The authors extracted useful knowledge from daily historical 
weather data for Gaza Strip city by applying basic clustering, 
classification and association rule mining algorithms, to 
assess their importance in the meteorological field for 
obtaining useful predictions and supporting decision making in 
different sectors [17]. 

• The authors analyzed the use of association rules for 
discovering the relationships between stream flow and climatic 
variables in the Kizilirmak River Basin in Turkey [18]. 

As discussed above, a number of applications use association 
rules, which justifies studying association rule mining for 
traditional and special applications that deal with environmental 
data. Here we study it with the Gujarati Hindu calendar, which, 
to the best of the authors' knowledge, has not been used in this 
way before. 

IV. Proposed Approach 

The proposed system uses temporal association rule mining to 
associate temperature with "tithis". It is shown in Figure 1. For 
the temporal association rule mining, inter-transactional 
association rule mining is used, so that associations are mined 
across transactions instead of within a transaction. 

The proposed system steps are as follows: 




Figure 1. Proposed System 



The main part of the system is data preparation. In the first 
step, weather data are collected consisting of temperature, sea 
level pressure, dew point, humidity, wind speed, visibility and 
precipitation. The authors found in an earlier analysis that 
predictability is better when the data are prepared by season 
rather than by year [19]. So, in the data aggregation step, the 
data are separated and aggregated according to the three seasons. 
Discretization is then applied to these data. After feature 
selection, only the selected features are kept for further 
processing. The "tithi" data are then included with the 
transaction data, and mega-transactions are prepared to discover 
associations not within a "tithi" but among "tithis". Next, the 
data are separated into training and testing sets. Temporal 
association rule mining is applied to the training data using the 
Apriori algorithm, and the resulting rule model is used on the 
testing data to test the association of "tithi" with temperature. 
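
The data preparation and mining steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes 1 degree intervals, a window of two consecutive days per mega-transaction, and single-item rules counted directly instead of a full Apriori pass. All names and the toy readings are hypothetical.

```python
from collections import Counter


def discretize(temp, width=1.0, offset=0.5):
    """Map a temperature to a half-open interval label such as '[28.5-29.5)'."""
    lo = ((temp - offset) // width) * width + offset
    return f"[{lo:.1f}-{lo + width:.1f})"


def mega_transactions(days, window=2):
    """Slide a window over consecutive days; tuple position is the day offset."""
    items = [discretize(temp) + tithi for temp, tithi in days]
    return [tuple(items[i:i + window]) for i in range(len(items) - window + 1)]


def mine_rules(megas, min_support, min_confidence):
    """Single-item rules day0 -> day1 as (antecedent, consequent, S, C)."""
    n = len(megas)
    pair_count = Counter((m[0], m[1]) for m in megas)
    first_count = Counter(m[0] for m in megas)
    rules = []
    for (a, b), count in pair_count.items():
        s, c = count / n, count / first_count[a]
        if s >= min_support and c >= min_confidence:
            rules.append((a, b, s, c))
    return rules


# Hypothetical summer readings (temperature in deg C, tithi) for three years.
seasons = [
    [(29.8, "Vad3"), (30.1, "Vad4")],
    [(30.0, "Vad3"), (29.9, "Vad4")],
    [(29.6, "Vad3"), (31.0, "Vad4")],
]
megas = [m for days in seasons for m in mega_transactions(days)]
rules = mine_rules(megas, min_support=0.2, min_confidence=0.5)
```

Because each item combines the discretized temperature with the tithi of its day, a rule relates the temperature on one tithi to the temperature on the following tithi, which is the "among tithis" association described above.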

V. Experimental Analysis 

We evaluated the temporal association rules generated from real 
data. The weather data were collected from the website 
http://wunderground.com/history/station/42840 for the Surat, 
India station. We aimed to discover rules that demonstrate the 
association between temperature and "tithi" and that can be used 
to predict temperature. 

The data cover three years, from 16th Feb 2008 to 15th Feb 2011. 
For these days, "tithi" information was collected from 
http://melbourne-jainsangh.org/useful-links/activity-tithi-calendar/. 
The data were separated by season and aggregated over the three 
years together, and the mega-transactions were prepared with the 
help of the "tithis". The system was tested with Support = 20% 
and Confidence = 20%. Of the mega-transactions, 70% were used for 
training and 30% for testing. The temporal association rules 
generated from the training data were applied to the testing data 
for future-day prediction and achieved 61% accuracy for the 
summer and monsoon seasons with "tithis" integrated into the 
parameters. A sample of the generated temporal association rules 
is shown in Table I. 

TABLE I. Sample Rules with Discretized Interval Values for 
Summer Season 

Rule 0: [25.5-26.5)Sud3 -> [25.5-26.5)Sud4 S=0.04 C=0.21 
Rule 1: [29.5-30.5)Vad3 -> [29.5-30.5)Vad4 S=0.09 C=0.51 
Rule 2: [28.5-29.5)Vad3 -> [28.5-29.5)Vad4 S=0.12 C=0.36 
Rule 3: [27.5-28.5)Sud15 -> [28.5-29.5)Vad1 S=0.18 C=0.53 
Rule 4: [28.5-27.5)Vad15 -> [26.5-27.5)Sud1 S=0.02 C=0.31 



The temporal association rules for the summer season can be read 
as follows: 

Rule 0 says that in "Sud tithis" the temperature stays in the low 
range. 

Rule 1 and Rule 2 say that in "Vad tithis" an above-average 
temperature stays in the same temperature range. 

Rule 3 says that when a "Sud tithi" changes to a "Vad tithi" the 
temperature increases. However, if the wind speed increases, the 
temperature does not change. 

Rule 4 says that when a "Vad tithi" changes to a "Sud tithi" the 
temperature decreases. 
In the same way, a number of other rules are generated that can 
help in further analysis of the weather. 
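
To make the evaluation step concrete, the sketch below shows one way a rule model could be applied to held-out pairs of consecutive days. It is an assumption-laden illustration: the text does not state how the 70/30 split was drawn or how accuracy was scored, so a chronological split is used and a test pair with no matching rule is counted as a miss. The three rules are Rules 1 to 3 of Table I; the test pairs and all function names are hypothetical.

```python
def split_train_test(megas, train_percent=70):
    """Chronological split; integer arithmetic avoids rounding surprises."""
    cut = len(megas) * train_percent // 100
    return megas[:cut], megas[cut:]


def build_model(rules):
    """For each antecedent keep the consequent with the highest confidence."""
    best = {}
    for antecedent, consequent, conf in rules:
        if antecedent not in best or conf > best[antecedent][1]:
            best[antecedent] = (consequent, conf)
    return {a: consequent for a, (consequent, _) in best.items()}


def accuracy(model, test_pairs):
    """Share of (today, tomorrow) pairs whose tomorrow matches the prediction."""
    if not test_pairs:
        return 0.0
    hits = sum(1 for today, tomorrow in test_pairs
               if model.get(today) == tomorrow)
    return hits / len(test_pairs)


# Rules 1-3 of Table I as (antecedent, consequent, confidence).
rules = [
    ("[29.5-30.5)Vad3", "[29.5-30.5)Vad4", 0.51),
    ("[28.5-29.5)Vad3", "[28.5-29.5)Vad4", 0.36),
    ("[27.5-28.5)Sud15", "[28.5-29.5)Vad1", 0.53),
]

# Hypothetical held-out pairs (not data from the study).
test_pairs = [
    ("[29.5-30.5)Vad3", "[29.5-30.5)Vad4"),   # matches Rule 1
    ("[28.5-29.5)Vad3", "[27.5-28.5)Vad4"),   # Rule 2 predicts a different range
    ("[27.5-28.5)Sud15", "[28.5-29.5)Vad1"),  # matches Rule 3
    ("[26.5-27.5)Sud2", "[26.5-27.5)Sud3"),   # no rule applies
]

model = build_model(rules)
score = accuracy(model, test_pairs)
```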

VI. Conclusion 

The integration of "tithi" with weather parameters can provide 
useful information for forecasting the weather parameter 
temperature. The system is uncomplicated compared with other, 
complex weather forecasting systems. The experimental results 
show that this rule model can generate results automatically, 
without requiring extra proficiency in weather forecasting or 
complex models. In future, we would like to examine the 
feasibility of integrating "tithi" for the other seasons and in 
combination with other data mining techniques. 

Acknowledgments 

This research work was carried out under the research project 
grant for SVNIT Assistant Professors bearing circular number 
Dean(R&C)/1503/2013-14. The authors acknowledge the institute, 
SVNIT. 

References 

[1] A. Doegar, A. Prasad, and H. Aslaksen, "Indian Calendars", Theses, 
Research in Mathematics (MA3 288), 2003. 

[2] J. Spate, K. Gibert, E. Miquel, J. Frank, J. Comas, I. Athanasiadis, and 
R. Letcher, "Data Mining as a Tool for Environmental Scientists", 
Elsevier Book Series for Developments in Integrated Environmental 
Assessment, 3, pp. 205-228. 

[3] E. Ikoma, K. Taniguchi, T. Koike, and M. Kitsuregawa, "Development 
of a data mining application for huge scale earth environmental data 
archives", Intl. J. of Computational Science and Engineering, 
2(5/6), 2006, pp. 262-270. 

[4] S. Dzeroski, "Environmental Applications of Data Mining", Lecture 
Notes of Knowledge Technologies, University of Trento. 

[5] J. Han, and M. Kamber, Data Mining: Concepts and Techniques, 2nd ed. 
Morgan Kaufmann, 2006. 

[6] M. H. Dunham, Data Mining Introductory and Advanced Topics, India- 
Pearson Education, Reprint, 2005. 

[7] B. Liu, "Association Rules and Sequential Patterns", in Web Data 
Mining: Exploring Hyperlinks, Contents and Usage Data. Berlin 
Heidelberg: Springer-Verlag, 2007, pp. 13-54. 

[8] J. Deogun, and L. Jiang, "Prediction Mining - An Approach to Mining 
Association Rules for Prediction", RSFDGrC 2005, Springer-Verlag 
Berlin Heidelberg, LNAI 3642, 2005, pp. 98-108. 

[9] F. Olaiya, "Application of Data Mining Techniques in Weather 
Prediction and Climate Change Studies", International Journal of 
Information Engineering and Electronic Business, vol. 1, 2012, pp. 51- 
59. 



[10] T. R. Sivaramakrishnan, and S. Meganathan, "Association Rule Mining 
and Classifier Approach for Quantitative Spot Rainfall Prediction", 
Journal of Theoretical and Applied Information Technology, vol. 34, 
Issue. 2, 2011. 

[11] E. Parvinnia, "The application of association rule mining a case study: 
The effect of atmospheric parameters on air pollution", Intl. Conf. 
Applied Computing, ISBN:978-972-8924-30-0, 2007, pp. 745-748. 

[12] Z. Liang, T. Xinming, and J. Wenliang, "Temporal Association Rule 
Mining Based On T-Apriori Algorithm and its Typical Application", 
Intl. Symposium on Spatial-Temporal Modeling Analysis, vol. 5, Issue. 
2, 2005. 

[13] Z. Guo, X. Dai, and H. Lin, "Application of Association Rule in Disaster 
Weather Forecasting", Annals of GIS, Intl. Association of Chinese 
Professionals in Geographic Information Science (CPGIS), vol. 10, No. 
1, 2004, pp. 68-72. 

[14] K. Koperski, and J. Han, "Discovery of spatial association rules in 
geographic information databases", in the 4th Intl Symposium on large 
spatial databases (SSD95), Maine, USA, 1995, pp. 47-66. 

[15] E. Yafi, A. S. Al-hegami, M. A. Alam, and R. Biswas, "Incremental 
Mining of Shocking Association Patterns," Engineering and 
Technology, 2009, pp. 801-805. 

[16] Y. P. Huang, L. J. Kao, and F. E. Sandnes, "Predicting Ocean Salinity 
and Temperature Variations Using Data Mining and Fuzzy Inference", 
Intl. Journal of Fuzzy Systems, vol. 9, Issue. 3, 2007. 

[17] S. N. Kohail, A. M. El-Halees, "Implementation of Data Mining 
Techniques for Meteorological Data Analysis", Intl. Journal of 
Information and Communication Technology Research (JICT), vol. 1, 
Issue. 3, 2011. 

[18] F. Dadaser-Celik, M. Celik, and A. S. Dokuz, "Associations Between 
Stream Flow and Climatic Variables at Kizilirmak River Basin in 
Turkey", Global NEST Journal, vol. 14 (3), 2012, pp. 354-361. 

[19] D. P. Rana, N. J. Mistry, and M. M. Raghuwanshi, "Temporal 
Association Rule Mining Analysis For Days Temperature", International 
Journal Elixir Computer Science and Engineering, vol. 62, 2013, pp. 
17728-17734. 

AUTHORS PROFILE 
D. P. Rana is Assistant Professor at Computer Engineering Department, S. V. 
National Institute of Technology, Surat, Gujarat, India and is currently 
pursuing her PhD degree. Her research interest is in the field of security in 
web applications, computer architecture, database management system, data 
mining and web data mining. She is a life member of ISTE and CSI. 

P. Chaudhari has completed her M. Tech. in Computer Engineering at S. V. 
National Institute of Technology, Surat, Gujarat-395007, India. 

N. J. Mistry is Professor at Civil Engineering Department, S. V. National 
Institute of Technology, Surat, Gujarat-395007, India. He is a member of 
CES. 

M. M. Raghuwanshi is working as a principal at Rajiv Gandhi College of 
Engineering and Research, Nagpur, India. He completed his PhD in Computer 
Science, 2007, at VNIT, Nagpur, India. He is a member of IEEE, ISTE and 
CSI. 






http://sites.google.com/site/ijcsis/ 
ISSN 1947-5500 



(IJCSIS) International Journal of Computer Science and Information Security, 
Vol. 13, No. 6, June 2015 



A Distinct Technique for Facial Sketch to Image Conversion 



Prof. Prashant Dahiwale 
Dept. of Computer Science & Engineering 
Rajiv Gandhi College of Engineering 
& Research, Wanadongri, Nagpur, India. 
prashant.dahiwale@gmail.com 



Madhura S. Bombatkar 
Dept. of Computer Science & Engineering 
Rajiv Gandhi College of Engineering 
& Research, Wanadongri, Nagpur, India. 
bombatkarmadhura@gmail.com 



Dr. M. M. Raghuwanshi 
Dept. of Computer Science & Engineering 
Rajiv Gandhi College of Engineering 
& Research, Wanadongri, Nagpur, India. 
raghuwanshimm@gmail.com 



Abstract — Many software applications on the market generate 
a sketch from an image, but the reverse conversion is largely 
unexplored, although such an implementation would be useful 
to crime investigation departments. This paper suggests a 
new approach for generating an image from a sketch. The 
sketch is broken down into its constituent facial 
components, these features are compared with an available 
database, and the best match for each is selected. The 
matched image components are then registered (pasted) on a 
blank face image, and a filtering algorithm is applied to 
smoothen the image. 

Index Terms — Feature detection, feature extraction, facial 
components, filtering algorithms, fiducial points, 
smoothening image. 

I. Introduction 

Many software tools and applications are available to convert 
an image to a sketch, and they are well known. The reverse, 
however, has not yet been developed; that is, there exists no 
methodology that supports the conversion of a sketch to an 
image. 

This paper presents a design for such a conversion. The plan is 
divided into four parts: detection and extraction of features, 
matching of features, registering features to form an image, 
and smoothening and finishing of the image. Features are 
detected and extracted using the developed technique, and the 
data obtained are used to match features. The matched features 
are then pasted onto the face mask to obtain the desired output 
image. 

The first module detects and extracts facial features, and a 
database of these features is generated for further use. The 
second module must be able to compare two inputs, a sketch and 
its candidate image feature, which are in different formats. 
The approach converts both inputs to a common format, such as 
black and white, and then applies a feature matching algorithm 
such as PCA. The algorithm is executed automatically on the 
database and on the test image to be converted, without 
requiring a separate compatible image to be supplied for each 
of the many matching runs against different image inputs. The 
third module extracts the desired features of the input sketch, 
matches them with their image equivalents from the database, 
and registers all the components together to form an output. 
The concluding module applies an image smoothening algorithm to 
give a finish to the output obtained. 

II. LITERATURE SURVEY 

Reference [1] presents a detailed treatment of various 
techniques for merging images. It introduces various reliable 
methods for achieving this objective along with their 
result-giving capacities, and provides an analysis on that 
basis. The authors focus on a smoothly finished image obtained 
by merging a few other images. 

The basics of recognizing the similarities between two faces 
are described in reference [2]. The approach is based on facial 
expressions, which is beneficial to our project because we 
consider facial features. The expressions used as 
distinguishing points in that paper are similar to the aspects 
we intend to use in our project. Thus the identification of 
features in that paper was taken into consideration and studied 
thoroughly. 

Pattern recognition and face recognition are the main 
objectives of reference [3], which not only introduces a 
different methodology for recognizing a face and producing a 
result in the form of acceptance or rejection, but also gives a 
percentage for the face match. No limit is placed on the 
percentage displayed, so every input is treated as a valid 
input and yields a valid output. 

The CMU pose and illumination work is a basis for 3D imaging, in 
which expressions are identified and compared in three 
dimensions. This did not prove to be of much use from the point 
of view of our project, but the identification methodologies 
can be replicated by simply omitting the three-dimensional 
part. Reference [4] introduces a comparison method that 
produces highly dependable results and thus can be useful. 



Reference [5] presents feature identification and recognition 
methodologies, improved so that facial expressions, features or 
faces can be identified in the first place even in disguise. It 
shows that any alteration to the original face image can be 
identified separately and then considered or ignored according 
to the input before recognition is done. 

Reference [6] proposes a technique for image refinement and 
edge detection that combines nonlinear diffusion and bilateral 
filtering. The model builds on two well-established 
methodologies in the image processing community, which makes 
the method easy to understand and implement. Numerical 
experiments show that the proposed model can achieve more 
accurate reconstructions from noisy images than other popular 
nonlinear diffusion models in the literature. Reference [6] 
also describes a new and simple diffusion stopping criterion, 
derived from the second derivative of the correlation between 
the noisy image and the filtered image. This indirect measure 
allows the diffusion process to be stopped close to the point 
of maximum correlation between the noise-free image and the 
reconstructed image when the former is not available. The 
stopping criterion is sufficiently general to be applied with 
most nonlinear diffusion methods normally used for image noise 
removal. 

Reference [7] is a literature survey of face matching and 
feature matching methodologies. All current techniques are 
studied and a detailed analysis is presented. The analysis 
studies all these techniques on the same databases and inputs, 
so that the outputs obtained can be visually judged to be 
similar or not, and to what extent, with documents represented 
as vectors. 



III. PROPOSED WORK 

We propose a simplified methodology for converting a sketch 
into an image such that the originality of all the features is 
retained. The basic approach is to identify the prominent 
features of a face, search for an appropriate image equivalent 
of each, and then place the image equivalent of each feature on 
the face mask, with proper pasting and smoothening so that the 
image looks genuine. 



[Figure: flowchart with the boxes "Input Sketch"; "Breakdown into 
several components of face" (using an algorithm); "Database"; 
"Searching and comparing techniques"; "Component Matching and 
Selecting" (using an algorithm); "Components" (shape of face); 
"Blank Mask"; "Registration on blank"; "Using Filtering or 
Smoothing Algorithm"; "Output Grayscale".] 

Fig. I - Architecture of proposed method 



A. Method of Data Collection 

Standard images and their equivalent sketches are collected 
from an authenticated collection. The database consists of 
more than fifty sketches and their equivalent images. The 
sketches are the test inputs, and all of them need to be of 
specific dimensions or a precise size. 

B. Preprocessing 

The database collected from authenticated collections is 
processed to obtain a database of facial components (facial 
features) in image format only, since the sketch is processed 
during execution of the code. A separate collection of these 
features is stored and retained for use during code execution. 
The feature databases consist of the following: 

• Eye database. 

• Nose database. 

• Mouth database. 

• Blank face database. 
C. Methodology 

The entire process is summarized in four modules, which are 
elaborated below. The modules are: 

• Feature Detection and Extraction. 

• Feature Matching. 

• Registering image equivalents. 

• Smoothening and finishing output 

D. Feature Detection and Extraction 

When an input is passed to the method, the primary task is to 
use feature detection methodologies to detect the prominent 
facial features. Several feature extraction methods have been 
introduced to generate patterns from time series data for 
classification purposes. The well-known kurtosis method 
provides a statistical measure of the amplitude of the time 
series. Another method constructs a feature vector from the 
spectrum, where the power spectral density and the wavelet 
coefficients are used along with PCA for feature extraction. 





Fig. II - Feature detection method 



To extract phase information, the Hilbert transform requires 
conversion of the real-valued signal into a complex-valued 
analytic signal. In SDF-based feature extraction, time series 
data are first converted into symbol sequences, and 
probabilistic finite-state automata are then constructed from 
these symbol sequences to compress the pertinent information 
into low-dimensional statistical patterns. SDF-based feature 
extraction from (wavelet-transformed) time series has been 
proposed by Jin et al. (2012) for target detection and 
classification in border regions. The rationale for using 
wavelet-based methods is the time-frequency localization and 
denoising of the underlying sensor time series. However, this 
method requires selection and tuning of several parameters 
(e.g., wavelet basis function and scales) for signal 
pre-processing, in addition to the size of the symbol alphabet 
that is needed for SDF. In this work, a cascade object detector 
is used to detect and extract the features. 




Fig. III - Feature extraction method.



E. Feature matching 

Principal component analysis (PCA) is applied to each image by
the Eigen Object Recognizer class. The result is an array of
eigenvalues that can be recognized by a trained neural network.
PCA is a frequently used method of object recognition because
its results can be fairly accurate and resilient to noise.






Fig. IV - Feature matching method 



The way in which PCA is applied can vary at different stages,
so what is demonstrated here is one clear method of applying
PCA that can be followed. It is up to individuals to experiment
to find the method that produces the most accurate results from
PCA. To perform PCA, several steps are undertaken:

• Set extracted feature as test image. 

• Consider train database of particular feature. 

• Perform PCA and determine the output.

• Display output. 
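
The four steps above can be sketched as follows. This is only a
minimal illustration in Java, not the implementation used in
this work: it assumes each image has already been flattened
into a vector of pixel values, keeps only the first principal
component (found by power iteration), and matches the test
vector to the nearest training vector along that component. The
class name, method names and toy data are hypothetical.

```java
final class PcaMatchSketch {

    /** Column-wise mean of the training vectors. */
    static double[] mean(double[][] rows) {
        double[] m = new double[rows[0].length];
        for (double[] r : rows)
            for (int j = 0; j < m.length; j++) m[j] += r[j] / rows.length;
        return m;
    }

    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int j = 0; j < a.length; j++) s += a[j] * b[j];
        return s;
    }

    static double[] minus(double[] a, double[] b) {
        double[] d = new double[a.length];
        for (int j = 0; j < a.length; j++) d[j] = a[j] - b[j];
        return d;
    }

    /** First principal component of centered data, by power iteration. */
    static double[] firstComponent(double[][] centered, int iterations) {
        int d = centered[0].length;
        double[] v = new double[d];
        java.util.Arrays.fill(v, 1.0 / Math.sqrt(d)); // arbitrary unit start vector
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[d];
            for (double[] r : centered) {             // next = (X^T X) v
                double p = dot(r, v);
                for (int j = 0; j < d; j++) next[j] += p * r[j];
            }
            double norm = Math.sqrt(dot(next, next));
            if (norm == 0) break;                      // degenerate data: keep current v
            for (int j = 0; j < d; j++) v[j] = next[j] / norm;
        }
        return v;
    }

    /** Index of the training vector closest to the test vector after projection. */
    static int bestMatch(double[][] train, double[] test) {
        double[] m = mean(train);
        double[][] centered = new double[train.length][];
        for (int i = 0; i < train.length; i++) centered[i] = minus(train[i], m);
        double[] pc = firstComponent(centered, 50);
        double target = dot(minus(test, m), pc);
        int best = 0;
        for (int i = 1; i < train.length; i++)
            if (Math.abs(dot(centered[i], pc) - target)
                    < Math.abs(dot(centered[best], pc) - target)) best = i;
        return best;
    }

    public static void main(String[] args) {
        double[][] eyeDatabase = { {1, 1}, {5, 5}, {9, 9} }; // toy "train database"
        double[] extractedFeature = {8, 10};                  // toy "test image"
        System.out.println("Best match index: " + bestMatch(eyeDatabase, extractedFeature));
    }
}
```

A practical system would keep several components and compare
distances in that higher-dimensional space; one component is
used here only to keep the sketch short.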

F. Registering Features and Smoothening Image 

The previous module, feature matching, provides several image
outputs, viz. eyes, nose, mouth and blank face. The objective
now switches to registering all these components together, in
the proper dimensions and at the proper locations. To find the
exact location of every feature, its original landmarks are
retrieved from the input sketch, which simplifies the task of
repositioning and resizing the features. Facial points are
detected so that every component is registered at its precise
location and dimensions.
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Fig. V - Approach for detecting location of components. 



The outcome is a patched image in which all the components are
in their appropriate locations, but whose facial appearance may
not be as pleasant as expected. The components therefore need
to be resized and realigned, for which filtering and
smoothening algorithms are applied.




Fig. VI - Approach for registering of features 



The output is finalized by performing gradient smoothening on
the image and running image blending algorithms. The result is
a visually pleasant image, as shown below.







Input Sketch Output Image 



Fig. VII - Final outcome of image formation. 
Abstract — This work first reviews the importance of information
security in today's digital culture, especially after the events
of September 11, 2001 in the United States. The subject concerns
not only information technology companies: it is present in the
daily life of all companies and, consequently, in specific
government legislation. It can be observed in the construction of
distributed systems (including operating and management systems),
in the network infrastructure of companies and organizations, and
in web sites. This study analyzes how Internet page servers work,
because many attacks exploit their vulnerabilities. Web site
programming (mainly of dynamic content) can also be used to
circumvent security and enable illegal access. Programmers should
observe some important practices to fend off attackers, because no
one can build a web site without taking into account both the
hosting and the writing of source code that reduces the system's
vulnerability to an acceptable minimum. Finally, the paper
comments on the ten most common types of vulnerability to watch
for when building web sites according to OWASP (The Open Web
Application Security Project), with the aim of creating awareness
about security in web site programming.

Keywords-Security, information, network infrastructure, 
distributed systems. 



I. Introduction 

Aware of the need to maintain the confidentiality, availability
and integrity of the information processed on their clients'
websites, network professionals (obviously including those who
work with the Internet) need to become acquainted with how the
requests and responses exchanged between client software
(browsers) and Web servers work.

The Web server studied here is written in the Java language,
initially developed by Sun Microsystems, which was later
acquired by Oracle.

The source code of this server (Daswani; Kern; Kesavan,
2007)[1] is in Annex I.

The methodology of this paper is to compile and run this server
locally and access it through a browser (e.g. Microsoft Internet
Explorer), which requests HTML pages, receives them and displays
them as if it were connected to the Internet or to an intranet,
even though the computer on which this process runs has no link
to any other computer.

Thus, the address of the requested page will initially be
http://localhost:8080/index.html. This URL is explained as
follows: http is the Internet protocol on which the WWW (World
Wide Web) service works. The term "localhost" indicates that the
server is local, so no external connection is needed.

The communication port used for this purpose is 8080, which is
given immediately after localhost and separated from it by ":"
(colon). This same port is commonly used for Web applications
(e.g. by Apache Tomcat, which is a container for Java Web
applications).
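
The parts of the URL described above can be checked with a few
lines of Java. This is only an illustrative sketch using the
standard java.net.URI class; the class and method names
(UrlPartsSketch, describe) are hypothetical.

```java
import java.net.URI;

final class UrlPartsSketch {

    /** Returns the scheme, host, port and path of a URL as a readable string. */
    static String describe(String url) {
        URI uri = URI.create(url);
        return "scheme=" + uri.getScheme()
                + " host=" + uri.getHost()
                + " port=" + uri.getPort()
                + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://localhost:8080/index.html"));
    }
}
```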

This example shows a Web server that processes only requests of
the GET type. When the address of a page is entered directly
into a browser, the browser first looks up the server's IP
address through DNS (Domain Name System).

When the server that hosts the page is found, the browser sends
it a GET request carrying the file name (which can be static,
such as .html, or dynamic, such as .php, .asp or .jsp). The
software presented here does not consider other request types,
since the focus of this paper is the safe development of
Internet pages. One security measure is to prevent a user from
executing malicious code within the hosted site. This type of
attack is made by entering the beginning of the URL followed by
a parameter that points to another page (containing the
malicious code).

For example:
http://www.meusite.com.br?pag=www.invadir.jsp.

This happens when the site above uses parameters to call
internal pages that fill frames or divs: instead of calling a
file from the server that hosts the URL, the site ends up
loading a page from another server (which belongs to the
hacker). Thus, a security breach occurs.
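
One common countermeasure is to accept only page names that the
site itself has listed. The sketch below illustrates the idea in
Java under stated assumptions: the parameter name "pag" comes
from the example URL above, while the class name, the method
name and the list of allowed pages are hypothetical.

```java
import java.util.Set;

final class PageParamSketch {

    // Hypothetical list of internal pages the site is willing to load.
    private static final Set<String> ALLOWED_PAGES = Set.of("index", "products", "contact");

    /** Maps the raw "pag" parameter to a local file name, never to a remote address. */
    static String resolve(String pagParam) {
        if (pagParam != null && ALLOWED_PAGES.contains(pagParam)) {
            return pagParam + ".jsp";
        }
        return "index.jsp"; // anything unexpected falls back to the home page
    }

    public static void main(String[] args) {
        System.out.println(resolve("products"));
        System.out.println(resolve("www.invadir.jsp"));
    }
}
```

Because the parameter is only ever compared against a fixed
list, a value pointing at another server can never be loaded.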

Starting on line 77 of the server code in Annex I, a try block
is created (try {...} catch (...) {...}) that tries to read the
requested file (FileReader(pathname)). If an exception occurs,
it is handled in the catch block, which sends the client browser
the message "HTTP/1.0 404 Not Found\n\n". The browser then
displays error 404, meaning that the page was not found because
the server searched the site and the file is not physically
there.

The Java exception concerned is FileNotFoundException. However,
others might occur; for example, the file might exist but not be
readable. To cover these and other cases, the catch clause
(catch (Exception e)) names "Exception", the superclass of all
exceptions, and therefore handles any of them and not only
FileNotFoundException. If the developer wanted to tell the end
user what the problem was, each case could be tested separately,
but in the web environment the default message that the browser
expects is generally displayed.

In addition, it would do little good (and would even be
dangerous for security) to show the Internet user the server's
internal error details.
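
The trade-off just described can be sketched as follows: each
exception type is handled separately and logged on the server
side, while the client only ever receives a generic status
line. This is an illustrative variant, not the code of Annex I;
the class name, method name and the 500 status chosen for read
failures are assumptions.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

final class ErrorMappingSketch {

    /** Returns the status line for a request, hiding internal details from the client. */
    static String statusFor(String pathname) {
        try (FileReader fr = new FileReader(pathname)) {
            fr.read(); // confirms the file can actually be read
            return "HTTP/1.0 200 OK";
        } catch (FileNotFoundException e) {
            // Details stay in the server log; the client sees only the status.
            System.err.println("missing file: " + e.getMessage());
            return "HTTP/1.0 404 Not Found";
        } catch (IOException e) {
            System.err.println("read failure: " + e.getMessage());
            return "HTTP/1.0 500 Internal Server Error";
        }
    }

    public static void main(String[] args) {
        System.out.println(statusFor("no-such-page.html"));
    }
}
```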

It can also be observed in this code that if the end user types
only http://www.meusite.com.br in the address bar, the server
notices that no specific page was requested and serves
index.html.

Another vulnerability that the server must not allow is for the
user to type "../../../../etc/shadow" in place of a page of the
site. This makes the browser send the request
GET ../../../../etc/shadow HTTP/1.0, which returns the password
file of the machine hosting the site if it follows the Linux or
Unix standard. One precaution is to ensure that the account
running the server is not allowed to read the shadow file.
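
A complementary precaution is to reject any requested path that
leaves the site's document directory. The sketch below shows one
way to do this with java.nio.file.Path; it is not part of the
server in Annex I, the names PathCheckSketch and isInsideRoot
and the directory /var/www are hypothetical, and symbolic links
are not resolved.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class PathCheckSketch {

    /** True only if the requested path, once normalized, stays under the root directory. */
    static boolean isInsideRoot(String root, String requested) {
        Path base = Paths.get(root).toAbsolutePath().normalize();
        // resolve() joins the two paths; normalize() collapses "." and ".." segments
        Path target = base.resolve(requested).normalize();
        return target.startsWith(base);
    }

    public static void main(String[] args) {
        System.out.println(isInsideRoot("/var/www", "index.html"));
        System.out.println(isInsideRoot("/var/www", "../../../../etc/shadow"));
    }
}
```

A server would call such a check before opening the file and
answer with the usual 404 when it fails.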

II. THE DEVELOPMENT ENVIRONMENT - 
PROGRAMMING 



From the programmer's standpoint, there are precautions to take
and frameworks to adopt in order to minimize threats to the
system. This chapter presents concepts of the programming
environment and the MVC design pattern (explaining its
importance) and, finally, turns its attention to the PHP
language.

Programmers have contributed greatly to design patterns, having
realized that certain programming solutions would be useful to
other developers, leading to greater flexibility, organization
and code efficiency. These patterns can be used in more than one
programming language and have become basic requirements in large
development companies, including those devoted to the Web and to
information security.

The full name of the MVC pattern is Model-View-Controller, and
each of these words names a development layer: Model,
Visualization and Control, respectively.

"MVC is a development concept (paradigm) and design that 
tries to separate an application into three distinct parts. On one 
hand there is the Model which is related to the current job that 
the application manages, on the other hand, there is the View, 
which is related to display data or information on this application 
and there is a third part, Controller, which coordinates the two 
previous parts displaying the correct interface or performing 
some work that the application needs to complete. " (Goncalves, 
Edson - 2007) [2] 

The Model (model layer) represents the application data
(database) with its tables (relational model) and their
definitions, such as stored procedures. The latter are
procedures stored directly in the database, written in the
DBMS's (Database Management System) own language; their
advantages are greater speed (as they are already compiled and
in the internal language) and safety, since they cannot be
executed by just any user.

Developers should not use the "root" account and password for
page services, especially on the Web. Unfortunately, many do not
follow this precaution and, when the software is ready, do not
switch the database user settings to an appropriate account with
limited rights.

In applications that are not divided into layers, there may be
SQL (Structured Query Language) commands that, although not
shown in the ".html" code returned by the server to the client
browser, pose a danger if they fall into the hands of hackers
who break into the server and gain access to the ".php" page,
for example.

To understand the information flow during navigation, note that
everything begins with the page request made by the end user.
When the server is found, it returns a page to the requester
and, if the page contains dynamic code (a program), that code is
executed. Take as an example a student (a user of the web
system) checking grades and absences on a college website: when
the registration number and password are entered, the Web server
sends them to the database server, which performs the query and
returns only the resulting data to the Web server (e.g. Apache).
The Web server in turn assembles a page formatting these data
and returns it to the end user (the student), delivering it to
the user's browser (at its IP address and communication port).

With the flow of information in a Web request understood, it is
possible to explain the View layer, which is the layer that
presents data to the user. The separation of layers is important
so that each one takes care of its specific functions. The View
does not need to know which SQL command was executed in the
database, and the Model must provide the requested data without
being responsible for presenting and formatting them.

Layers are separated not only for safety reasons but also to
make code more readable and able to be developed separately: a
company may have a page design sector independent of the
programming sector (PHP, Java, .Net). Employees can thus become
more efficient, as their processes are more specific (each in
their own area).

To control the exchange of information between the Model and
the View there is the Controller layer. The user's request, made
through the web site (as viewed by the customer), needs to query
or change the database (Model); the Controller determines how
this is done and how the information is handled before and after
the database is contacted ("before" to check security and
business rules, for example, and "after" to send the results to
the end user in an appropriate format in the View).
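
The division of responsibilities among the three layers can be
sketched in a few lines of Java. This is an illustration only:
all class names are hypothetical, the "database" is an in-memory
map, and the seven-digit registration format is an assumed
business rule.

```java
import java.util.Map;

final class MvcSketch {

    /** Model: owns the data and knows nothing about HTML. */
    static final class GradesModel {
        // Stand-in for the database; a real Model would run a query here.
        private final Map<String, Double> gradesByStudent = Map.of("2015001", 8.5);

        Double findGrade(String registration) {
            return gradesByStudent.get(registration);
        }
    }

    /** View: formats data for the browser and knows nothing about storage. */
    static final class GradesView {
        String render(String registration, double grade) {
            return "<p>Student " + registration + ": grade " + grade + "</p>";
        }

        String renderError() {
            return "<p>Request could not be processed.</p>";
        }
    }

    /** Controller: checks the request "before" and formats the answer "after". */
    static final class GradesController {
        private final GradesModel model = new GradesModel();
        private final GradesView view = new GradesView();

        String handle(String registration) {
            // "before": validate the input prior to touching the Model
            if (registration == null || !registration.matches("\\d{7}")) {
                return view.renderError();
            }
            Double grade = model.findGrade(registration);
            // "after": hand the result to the View in an appropriate format
            return grade == null ? view.renderError() : view.render(registration, grade);
        }
    }

    public static void main(String[] args) {
        GradesController controller = new GradesController();
        System.out.println(controller.handle("2015001"));
        System.out.println(controller.handle("' or 1"));
    }
}
```

Because validation lives in the Controller, malformed input is
rejected before any query is built, and the View never sees raw
database details.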

As stated earlier, this solution can be implemented in the PHP
language, which is chosen by developers seeking, among other
advantages, the benefits of the object-oriented programming
paradigm.

PHP can be used in the structured paradigm as well as in the
object-oriented one. The characteristics of the latter are
encapsulation, inheritance, polymorphism, composition and the
use of so-called "interfaces".

According to Niederauer, Juliano (2005) [3]: "PHP is a language
dedicated to the Web, so there must be a Web server which
receives the requests for pages and does the processing through
PHP, returning a result to the browser."

Both Apache and MySQL (combined with PHP) require settings made
on the server through lines of text in their respective
configuration files. When an Internet provider is hired for
hosting services, the programming language must be specified, as
well as the web server program and the database used in building
the site in question.



III. VULNERABILITIES TO BE OBSERVED WHEN
PREPARING "SITES"

The following question is pertinent to this paper: what is a
secure site? Many people would answer that it is a site where
there is no risk of losing money: a shopping site that really
sends exactly the product ordered, or an Internet bank where no
one else can perform operations on the account or cause losses.
Others, more informed, would say that secure sites are those
that show a padlock at the bottom of the browser; the padlock is
one means of security, but not the only one.

The padlock shown in the browser means that the communication
channel between the browser and the site is secure against
interception. An intermediary can even tap the line, but since
the data is transmitted in encrypted form, he cannot understand
it. Even after tapping the line, an attacker cannot learn the
account number, the password, or exactly which pages are being
visited. One should choose sites on which the padlock is shown
and avoid sites where it does not appear, especially for
financial transactions and shopping. This may seem enough, but
there are several other threats and vulnerabilities that can be
used to compromise the security of a web site in many different
ways. The OWASP project (The Open Web Application Security
Project) describes what it considers the ten most common types
of technical vulnerability in web systems:

• A1: Injection

• A2: Cross-Site Scripting (XSS)

• A3: Broken Authentication and Session Management

• A4: Insecure Direct Object References

• A5: Cross-Site Request Forgery (CSRF)

• A6: Security Misconfiguration

• A7: Insecure Cryptographic Storage

• A8: Failure to Restrict URL Access

• A9: Insufficient Transport Layer Protection

• A10: Unvalidated Redirects and Forwards

Among these ten, the "closed padlock" corresponds to item A9.
There are nine other large groups of vulnerabilities to which a
site may be subject. Nowadays there are standards and best
practices for building web sites with the intention of making
them resistant to the vulnerabilities and threats that plague
applications of this nature.

This project (OWASP - Top 10) [4] aims to create awareness about
application security by identifying some of the most critical
risks that organizations face.

Attackers can potentially use different routes through an
application to damage an organization's business. Each of these
routes represents a risk that may or may not be serious enough
to deserve attention.

Figure 1 - Attacks on Web Applications
Source: https://www.owasp.org/index.php/Top_10_2010-Main,
viewed on 05/08/2015.

This paper explains and exemplifies solutions in the PHP
language for the first two vulnerabilities, displaying the
source code and interpreting it for the reader.

A1 - INJECTION

Injection flaws occur when untrusted data is sent to an
interpreter as part of a command or query. The attacker's
hostile data can trick the interpreter into executing unintended
commands or accessing unauthorized data.

When the site requests an ID from the user, the user can enter
malicious code to gain unauthorized access to sensitive
information. The source code might be written, for example, as:

$query = "SELECT * FROM customers WHERE username = '$user_name'";

In this example (in PHP), the information in the table
"customers" is selected for a particular user who, at an earlier
point in the application, must have supplied the correct
password and thus obtained access to the system.

However, even without the correct password, an attacker can
exploit a weak point that less experienced programmers leave in
the system: entering a fragment of SQL code to defeat security.

Instead of typing a name, the hacker types "' or 1".

The opening quotation mark closes the string with nothing inside
it, so no user name is supplied as expected. The next step is
that the PHP site runs "or 1" as part of the SQL command. In the
truth table, when an expression has two operands connected by
"or", it is sufficient that one of them is true for the entire
expression to be true.

Then, since the where clause of "Select * from customers
where..." evaluates to true, the query returns the data of that
table to the attacker.

The same holds when a password is requested, since the malicious
code described here also works with a SQL command whose where
clause has more parameters, such as: "Select * from customers
where username = '$user_name' and password = '$password'".

Another danger is that the hacker deletes the table records by
typing: '; DELETE FROM customers WHERE 1 or username = '. In
this example, the "where" clause would also be true for all
records, so the delete command would erase them all.

The solution in both cases is different for Java and PHP. In
Java, the recommendation is to use a PreparedStatement object
with a question mark in the SQL command in place of each piece
of data:

"Select * from customers where username = ? and password = ?"

Each question mark is then set separately, by position
(objeto.setString(1, name) and objeto.setString(2, password)).
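
The contrast between the two approaches can be sketched as
follows. The first method shows how concatenation lets input
rewrite the query; the second binds values with the standard
java.sql.PreparedStatement API. The class and method names are
hypothetical, the payload ' OR '1'='1 is a commonly cited
variant chosen for illustration, and the password comparison in
SQL simply mirrors the query quoted above. An open Connection
is assumed to be supplied by the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class InjectionSketch {

    /** Vulnerable: user input becomes part of the SQL text itself. */
    static String concatenatedQuery(String userName) {
        return "SELECT * FROM customers WHERE username = '" + userName + "'";
    }

    static final String PARAMETERIZED =
            "SELECT * FROM customers WHERE username = ? AND password = ?";

    /** Safer: the SQL text is fixed and the values are bound separately by position. */
    static boolean credentialsMatch(Connection connection, String name, String password)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(PARAMETERIZED)) {
            statement.setString(1, name);      // first "?"
            statement.setString(2, password);  // second "?"
            try (ResultSet rows = statement.executeQuery()) {
                return rows.next();            // true if at least one row matched
            }
        }
    }

    public static void main(String[] args) {
        // The attacker's input closes the quote and appends an always-true condition.
        System.out.println(concatenatedQuery("' OR '1'='1"));
    }
}
```

With bound parameters the same input is compared as a literal
user name, so it matches no record instead of matching all of
them.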



In PHP, one approach is to create a function that eliminates
this possibility, either by removing SQL keywords such as "from,
alter table, select, insert, delete, update, where, drop table,
show tables" or by turning them into a string (text) that cannot
be executed. With this transformation, commands are recorded in
the database as plain text (in the columns name and password).
PHP's PDO and mysqli extensions also support prepared statements
like the Java ones above.

That function is described in the following lines of source code
(taken from the site
http://www.htmlstaff.org/ver.php?id=18528, accessed 03/07/2012).

<?php
function anti_injection($field, $adicionaBarras = false)
{
    // remove words and symbols that carry SQL syntax
    $field = preg_replace(
        "/(from|alter table|select|insert|delete|update|where"
        . "|drop table|show tables|#|\*|--|\\\\)/i",
        "", $field);
    $field = trim($field);        // remove surrounding whitespace
    $field = strip_tags($field);  // strip HTML and PHP tags
    // note: get_magic_quotes_gpc() no longer exists in PHP 8
    if ($adicionaBarras || !get_magic_quotes_gpc())
        $field = addslashes($field); // escape quotes with backslashes
    return $field;
}
?>



A2 - CROSS-SITE SCRIPTING (XSS)

XSS flaws occur whenever an application takes untrusted data and
sends it to the web browser without proper validation and
encoding. XSS allows attackers to execute a sequence of commands
(a script) in the victim's browser, which can hijack user
sessions, deface web sites or redirect the user to a malicious
site, for example (fictitious):
http://www.meusite.com?page=http://www.sitehacker.com.

Here the site has an internal parameter called page, whose
content (which should point to a page of the meusite domain) was
forged to show the hacker's site.

Once loaded, the malicious page can run its commands as if it
belonged to the site being "attacked"; if the site includes the
page as server-side code, those commands run directly on the
server hosting it.

The solution is to check each parameter received (in the page
address, parameters appear after the question mark and are
separated by "&") to determine, before running anything, whether
it is reliable.

On the website of Fatec Ourinhos (http://www.fatecou.edu.br)
this precaution was taken with the following code:

<?php

$ref = $_GET['content'];
if ($ref == "") $ref = "index";
// explode() replaces split(), which was removed in PHP 7
$gets = explode(".", $ref);
if (count($gets) > 1) {
    $ref = $gets[0];
    for ($i = 1; $i < count($gets); $i++) {
        // variable variables replace eval(), which must never
        // be fed request data
        ${"var" . $i} = $gets[$i];
    }
    if (!is_file($ref . ".php")) {
        $ref = "index.php";
    }
    else {
        $ref = $ref . ".php";
    }
}
?>

In this example, if the URL entered does not carry a parameter
naming a file that exists on the Fatec server, the parameter
takes the value index.php, and the home page of the site is
shown.

If the page is found on the server, it is displayed
immediately.
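
Parameter checking covers the validation half of the XSS
definition given in this section; the "proper encoding" half can
be sketched as below. This is a minimal Java illustration, not
code from the site discussed: the class and method names are
hypothetical, and the set of escaped characters is the usual
minimum for text placed inside HTML element content.

```java
final class HtmlEscapeSketch {

    /** Replaces the characters that HTML treats as markup with entities. */
    static String escapeHtml(String untrusted) {
        StringBuilder out = new StringBuilder();
        for (char c : untrusted.toCharArray()) {
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("<script>alert(1)</script>"));
    }
}
```

Escaped this way, a script typed into a form field is shown to
the next visitor as text instead of being executed by the
browser.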



IV. CONCLUDING REMARKS 



From the research carried out for this work and from personal
experience in developing Web sites, it is possible to conclude
that it is very important to take proper precautions and to use
the available tools efficiently to create and maintain a secure
environment in computer networks.

It can be seen that attacks might come from the World Wide Web
or even from the company's intranet. The network environment
enables collaboration and significant results and is currently
indispensable to production and business, for example in
industries and service providers.

Defense tools may be free or proprietary software, and less
experienced developers are at high risk if they do not know
them.

Their education must be extensive, since caring for security
ranges from the configuration of the page server and the
database server, through the choice of programming languages
with better resources and fewer vulnerabilities, to the
workplace itself (with appropriate practices that help avoid,
for example, social engineering attacks).

In conclusion, it is essential to point out that, after making
these choices, it is necessary to improve programming
techniques, seeking to avoid breaches in the source code of the
pages that make up websites, especially those of vital
importance to the operation of enterprises, whether they are
internal or used in competition and/or collaboration with others
in a globalized environment.

V. ANNEX I 



1. /*
2.  SimpleWebServer.java
3.  This toy web server is used to illustrate security vulnerabilities.
4.  This web server only supports extremely simple HTTP GET requests.
5.  This file is also available at http://www.learnsecurity.com/ntk
6. */
7. package com.learnsecurity;
8. import java.io.*;
9. import java.net.*;
10. import java.util.*;
11. public class SimpleWebServer {
12.   /* Run the HTTP server on this TCP port. */
13.   private static final int PORT = 8080;
14.   /* The socket used to process incoming connections
15.      from web clients */
16.   private static ServerSocket dServerSocket;
17.   public SimpleWebServer () throws Exception {
18.     dServerSocket = new ServerSocket (PORT);
19.   }
20.   public void run() throws Exception {
21.     while (true) {
22.       /* wait for a connection from a client */
23.       Socket s = dServerSocket.accept();
24.       /* then process the client's request */
25.       processRequest(s);
26.     }
27.   }
28.   /* Reads the HTTP request from the client, and
29.      responds with the file the user requested or
30.      a HTTP error code. */
31.   public void processRequest(Socket s) throws Exception {
32.     /* used to read data from the client */
33.     BufferedReader br =
34.       new BufferedReader (
35.         new InputStreamReader (s.getInputStream()));
36.     /* used to write data to the client */
37.     OutputStreamWriter osw =
38.       new OutputStreamWriter (s.getOutputStream());
39.     /* read the HTTP request from the client */
40.     String request = br.readLine();
41.     String command = null;
42.     String pathname = null;
43.     /* parse the HTTP request */
44.     StringTokenizer st =
45.       new StringTokenizer (request, " ");
46.     command = st.nextToken();
47.     pathname = st.nextToken();
48.     if (command.equals("GET")) {
49.       /* if the request is a GET
50.          try to respond with the file
51.          the user is requesting */
52.       serveFile (osw,pathname);
53.     }
54.     else {
55.       /* if the request is a NOT a GET,
56.          return an error saying this server
57.          does not implement the requested command */
58.       osw.write ("HTTP/1.0 501 Not Implemented\n\n");
59.     }
60.     /* close the connection to the client */
61.     osw.close();
62.   }
63.   public void serveFile (OutputStreamWriter osw,
64.       String pathname) throws Exception {
65.     FileReader fr=null;
66.     int c=-1;
67.     StringBuffer sb = new StringBuffer();
68.     /* remove the initial slash at the beginning
69.        of the pathname in the request */
70.     if (pathname.charAt(0)=='/')
71.       pathname=pathname.substring(1);
72.     /* if there was no filename specified by the
73.        client, serve the "index.html" file */
74.     if (pathname.equals(""))
75.       pathname="index.html";
76.     /* try to open file specified by pathname */
77.     try {
78.       fr = new FileReader (pathname);
79.       c = fr.read();
80.     }
81.     catch (Exception e) {
82.       /* if the file is not found, return the
83.          appropriate HTTP response code */
84.       osw.write ("HTTP/1.0 404 Not Found\n\n");
85.       return;
86.     }
87.     /* if the requested file can be successfully opened
88.        and read, then return an OK response code and
89.        send the contents of the file */
90.     osw.write ("HTTP/1.0 200 OK\n\n");
91.     while (c != -1) {
92.       sb.append((char)c);
93.       c = fr.read();
94.     }
95.     osw.write (sb.toString());
96.   }
97.   /* This method is called when the program is run from
98.      the command line. */
99.   public static void main (String argv[]) throws
100.      Exception {
101.     /* Create a SimpleWebServer object, and run it */
102.     SimpleWebServer sws = new SimpleWebServer();
103.     sws.run();
104.   }
105. }

[4] The Open Web Application Security Project (OWASP) is a worldwide
not-for-profit charitable organization focused on improving the security
of software. Their mission is to make software security visible, so that
individuals and organizations worldwide can make informed decisions
about true software security risks.
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Abstract — Although the basic application of Information and 
Communication Technologies (ICT) in the Tanzanian health care
system started years ago, fragmentation of Information Systems
(IS) and limited interoperability remain big challenges. In this
paper, we present an analysis of the present health care delivery
service, of HIS and of some existing eHealth solutions, focusing
on interoperability and collaboration. Through interviews,
questionnaires and analysis of eHealth implementations in
relation to interoperability and collaboration, we have
established that the lack of standard procedures to guide the
lifecycle of eHealth systems across the health sector and poor
willingness to collaborate among health stakeholders are key
issues that hinder the realization of the benefits of ICT use in
the health sector of Tanzania. Based on the findings, we provide
some recommendations with a view to improving interoperability
and collaboration.

Keywords: eHealth; healthcare; eHealth adoption; 
interoperability. 

I. Introduction 

It is widely accepted that the application of information and
communication technologies (ICT) in health has enhanced the
provision of health services across the world [1], [2], [3]. The
World Health Organization defines eHealth as "the cost-effective
and secure use of information and communications technologies
(ICT) in support of health and health-related fields, including
health-care services, health surveillance, health literature,
and health education, knowledge and research" [4]. Regardless of
its importance, the adoption of eHealth standards in many
African countries is still a challenge [5], [6], [7]. The health
care system of Tanzania, like that of many other African
countries, has been facing similar problems [8], [9]. Given the
need for good health care delivery services in society, these
problems cannot be avoided and will require fundamental changes
in the current health care arrangements [10]. The Tanzanian
government, through the ministry in charge of the health sector
and social welfare (MoHSW), has developed a strategic plan, the
Health Sector Strategic Plan III, to guide priority setting and
deployment of resources in the health sector [11]. The already
initiated Tanzania National eHealth Strategy (2013 - 2018) aims
to integrate all fragmented information systems (IS) and offer a
complete solution that will benefit all interested parties. To
achieve this, eHealth standards, systems interoperability and
collaboration between different eHealth stakeholders must be
given serious consideration. In particular, achieving systems
interoperability requires agreement on the data standards to be
used. This results in efficient collaboration among different
eHealth stakeholders in accomplishing goals such as improving
the quality of patient care, reducing medical errors and,
therefore, saving both human and financial costs [5]. A recent
study by Lawrence explains the issues, challenges and
opportunities towards EHR interoperability in Tanzanian
hospitals; the main concerns were privacy, security and
confidentiality when considering information and data sharing
[12]. Hence it was important to know how far we are in eHealth
standards adoption, systems interoperability and collaboration
among eHealth stakeholders in our health sector.

A well-organized Tanzanian health care system should provide 
opportunities for high quality, professional work with patients 
and for long-term development, while relevant and reliable 
economic, administrative and medical data provided by eHealth 
should facilitate better planning, control and management of 
individual health care organizations and of the health care 
system in general. The question addressed in this research is: 
what currently exists in the Tanzanian eHealth landscape? The 
main objective of the paper is to analyze activities and 
operations in the current eHealth landscape in Tanzania, 
focusing on systems interoperability and collaboration between 
eHealth stakeholders. After the introduction, the second section 
presents the healthcare system of Tanzania's mainland and the 
challenges in adopting eHealth standards in Tanzania. The third 
section describes our study and methodology. The fourth section 
presents the findings deduced from the analysis of activities 
and operations in the current eHealth landscape in Tanzania. 
Section five provides a discussion. The last section gives the 
conclusion and some recommendations for more effective further 
development and implementation of eHealth in Tanzania. 

II. HEALTHCARE SYSTEM OF TANZANIA'S MAINLAND 

Health infrastructure and healthcare services on the Tanzanian 
mainland are organized into four levels: the primary level 
(village health posts, dispensaries, and health centers), 
district hospitals, regional hospitals and, finally, 
consultant/specialized hospitals [13]. About 90% of the 
population live within five kilometers of a primary health 
facility [10]. First-line care in rural areas is provided by 
Clinical Officers with 3 years of medical training or Assistant 
Medical Officers with an additional 2 years of medical training 
[14]. Mandatory health-insurance schemes offering comprehensive 
health care benefits to their members have been introduced for 
formal-sector employees; the largest is the National Health 
Insurance Fund, which covers civil servants, while the National 
Social Security Fund covers private formal-sector employees [15]. 

A. Challenges in Adopting E-Health in Tanzania 

While the integration of ICT and healthcare has brought many 
potential benefits, many challenges affect its adoption in 
Tanzania. Different studies show that inadequate ICT 
infrastructure, unreliable electric power, low ICT budgets, lack 
of coordination on ICT matters among ministries, departments, 
and agencies (MDAs) as well as partners, poor e-healthcare 
systems design, and inadequate ICT skills among healthcare 
workers, to mention a few, are the bottlenecks to the adoption 
of eHealth in Tanzania [8], [9]. As stated in the action plan 
report by the Ministry of Health and Social Welfare [10], 
current challenges to eHealth in Tanzania include: 

• A fragmented landscape of eHealth pilot 
projects and stakeholders 

• Numerous data and health information 
systems (HIS) silos 

• Lack of ICT infrastructure 

• Lack of ICT workers, in particular those who 
are well trained 

• Lack of coordination on ICT matters among 
ministries, departments, agencies (MDAs), 
and the lack of an architecture to guide the 
development of HIS 

• Lack of compliance with eHealth standards 
and systems interoperability 

Given these challenges, an analysis of activities and operations 
in the current eHealth landscape in Tanzania was necessary. 

III. OUR STUDY AND METHODOLOGY 

A. Area of Study 

This study was carried out in Dar es Salaam and Arusha, 
Tanzania. We focused more on Dar es Salaam since it has more 
healthcare facilities as well as key informants among health care 
workers, preferably supervisors or staff in charge of health 
institutions [16]. The analysis of HIS was carried out in 
hospitals, dispensaries (health institutions) and some companies 
that are involved in developing health management systems. 

B. Sampling and Data Collection 

A cross-sectional study was conducted in eight hospitals, seven 
dispensaries and some companies that are involved in developing 
health management systems. Data collection included the use of 
structured questionnaires and interviews. Data were collected in 
order to analyze the current activities and operations in 
eHealth. Guided questionnaires were used to measure the 
intensity and strength of the factors associated with the 
current activities and operations in eHealth. Existing documents 
such as journal articles and official reports related to the 
topic under study were also reviewed. 

C. Data Analysis 

Statistical Package for Social Sciences (SPSS) was used for data 
analysis. We present the findings in tables for readability and 
ease of interpretation. Significance was tested at the p = 0.05 
level, corresponding to a 95% confidence level. 

IV. ANALYSIS / RESEARCH FINDINGS 

The analysis of the current health care delivery service, the 
applicability of eHealth components and some existing eHealth 
solutions and systems, focusing on collaboration and system 
interoperability, resulted in key findings that are presented by 
category as follows: 

A. Existing eHealth Solution and Health Information Systems 

Health service, particularly when considering eHealth 
(applications and systems), involves several tasks (reporting, 
collection, management, knowledge transfer or analysis of data, 
to mention a few). Our examination reveals the existence of 
various systems that concentrate on the collection, management 
and analysis of data, but which are neither interconnected nor 
interoperable. 

The AllseeEHR system, which is implemented in government 
hospitals in Kinondoni Municipality in Dar es Salaam, records 
patient information for reporting, but its emphasis is on 
recording cash flow from different sectors. A patient's history 
can be viewed once he or she provides a registered ID, but the 
system is neither interoperable nor interconnected among the 
implementing partners. On the other hand, open source software 
such as OpenMRS and Care2x has been implemented in some areas 
for purposes such as management of HIV/AIDS and registration. 
LIS, JIVA, LMIS, DHIS2 and CTC2 are existing health systems 
implemented in various health institutions, but they are not 
interoperable or interconnected either. 

B. Distribution of eHealth Services between Rural and Urban 
Areas 

In recent years there has been an increase in the number of health 
facilities in the country, so that the majority of the population 
lives within 5 km of a health facility. However, there are 
still geographical inequalities in access to health services 
[16]. 

With regard to geographical inequalities in access to health 
services between rural and urban areas, there is relatively 
higher support from various stakeholders in urban areas than 
in rural areas [8]. This is in line with our findings, where 
a number of health stakeholders prefer to base their business 
in urban areas because of infrastructure problems in rural 
areas. This widens the gap in access to health services and 
eHealth applicability between the two areas. As it stands, 
there is little effort by the government or other health 
stakeholders to resolve the situation, the challenge being 
inadequate resources. 

C. Collaboration Among eHealth Stakeholders 

The health sector involves a number of stakeholders, including 
government, the public/users, policy makers, healthcare 
professionals, funders, etc., who may be categorized in 
different ways. In this study, when considering collaboration 
among eHealth stakeholders, we use four categories: developers, 
implementers, clinicians (health care providers) and users. 

The study revealed that collaboration among these categories 
does exist; however, the lack of a standardized way (an 
agreed-upon tool) for collaboration among the eHealth 
stakeholders was found to be a big challenge. The result of the 
chi-square test shows that the level of collaboration among 
eHealth stakeholders is significant (p = 0.026). We also found 
poor willingness to collaborate among private companies or 
vendors involved in developing health management systems (the 
developers category). Some of the reasons are business related, 
and so far there is no initiative to bring those companies 
together so that they can sit and reach an agreement on how to 
collaborate, the tools for achieving such collaboration, 
business issues and a policy to guide them in their 
collaboration. This would help to solve the two preceding 
challenges. With that in mind, we asked the stakeholders 
(participants) about collaboration and the tools used in 
achieving it. The results were as follows: 



Table 1: Stakeholders' responses on collaboration and the tools 
used in achieving such collaboration (N=102) 

Category       Interviewed    Number of       Tools used (percentage representation)   Total % 
               stakeholders   collaborating   Phone    Email    Phone and   Git/any 
                              stakeholders                      Email       CVS 
Developers     16             13              43.75    25       -           12.5       81.25 
Implementers   7              7               71.43    28.57    -           -          100 
Clinicians     22             22              54.55    45.45    -           -          100 
Users          57             51              80.7     8.77     -           -          89.47 


Because the internet is unreliable in most hospitals, despite 
the presence of the National ICT Broadband Backbone (NICTBB), 
information exchange by email is less preferred than the use of 
mobile phones. From another angle, collaboration among private 
hospitals, or between private and government hospitals, was 
found to be poor; that is, the willingness of those parties to 
collaborate is low. Some argued that they are competing 
businesses and that it is difficult to collaborate with a 
competitor. Nevertheless, we present the views of health 
stakeholders on collaboration in their respective 
organizations. 





                   Frequency   Percent 
Valid   Satisfy    42          41.2 
        Poor       36          35.3 
        Normal     24          23.5 
        Total      102         100.0 

Table 2: Health stakeholder views on collaboration in their 
respective organization (N=102). 



D. System Interoperability 

As stated in a report by the Ministry of Health and Social 
Welfare (2013-2018) [10], Tanzania's HIS face system 
interoperability problems. We found that almost 62.5% of the 
reported complexity in data integration, and hence 
interoperability, was in line with our hypothesis that 
"Interoperability fails because of lack of coordination at all 
levels of systems development. A well designed collaboration 
architecture will facilitate coordination which, in turn, will lead 
to interoperable systems development". In addition, the common 
use of open source systems that were not specifically designed 
for the local context and environment (which varies in health 
culture) is a source of fragmentation and lack of 
interoperability. 

The lack of a compulsory governance structure and standards to 
guide the development of eHealth systems across the health 
sector (an architecture, security and a data dictionary) adds to 
the interoperability problem [10]. Consistent with this remark, 
we observed different systems with different designs and data 
structures, which also adds to the system interoperability 
problem. Although the current focus is the creation of a common 
data warehouse through integration of the diverse information 
systems into DHIS2, which deals mostly with data collection and 
analysis processes, awareness of interoperability and data 
standard adoption is still low among health and ICT workers. 
When asked about these two issues, 55.9% of interviewed 
personnel gave a poor response, and the result of the chi-square 
test shows that system interoperability is significant with 
p = 0.004. 

V. DISCUSSION 

This study reveals that current eHealth activities on the 
Tanzanian mainland still face many challenges involving systems 
interoperability and collaboration among eHealth stakeholders. 
Although there is an eHealth policy to direct what is to be done 
and how, the situation is quite different in most health centers 
and hospitals. In most cases the reasons are inadequate ICT 
infrastructure, inadequate resources, poor ICT skills among 
health workers and budget limitations. These findings support 
the findings of previous studies [8], [9]. 

System interoperability is an important aspect of achieving good 
health care service delivery [5]. In our study, we found that 
almost 86.3% of the systems are not capable of sharing 
information (that is, they are not interoperable). Several 
factors contribute to this situation, the most common being that 
most systems are designed according to individual hospital needs 
and differ considerably in their data structures or formats. 
Moreover, querying multiple data sets with different formats 
requires a mediated schema, which in turn requires scientists to 
have knowledge of the query syntax [17]; awareness of this among 
most of our health IT staff is still low. We also found that 
security and privacy concerns are associated with most 
organizations' unwillingness to share their data. This is in 
line with [12], who said that "Tanzania health consumers 
should be made comfortable by ensuring that the issue 
surrounding privacy and security of their health records are 
clearly addressed before taking any further step towards the 
implementation of interoperable EHRs for health information 
exchange". In order to deal with interoperability problems, a 
common data standard must be agreed upon. "At the most basic 
level, the data standards are about the standardization of data 
elements: (1) defining what to collect, (2) deciding how to 
represent what is collected (by designating data types or 
terminologies), and (3) determining how to encode the data for 
transmission" [18]. So where data standards and data quality are 
lacking, interoperability becomes a big challenge to handle. 
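
The three steps in the quoted definition can be illustrated with a 
minimal Python sketch. The element names, the coded value set and 
the JSON encoding below are hypothetical choices made for 
illustration only; they are not taken from any Tanzanian standard 
or from the systems surveyed in this study. 

```python
import json
from dataclasses import dataclass

# Steps (1) and (2): what to collect and how to represent it.
# A hypothetical data element with a data type and an optional code list.
@dataclass(frozen=True)
class DataElement:
    name: str
    data_type: type
    allowed_values: tuple = ()  # empty tuple means "any value of data_type"

# Hypothetical shared data dictionary agreed between two systems.
DATA_DICTIONARY = {
    "patient_id": DataElement("patient_id", str),
    "age_years": DataElement("age_years", int),
    "sex": DataElement("sex", str, ("F", "M")),
}

def validate(record: dict) -> dict:
    """Check a record against the shared dictionary and return it unchanged."""
    for key, value in record.items():
        element = DATA_DICTIONARY.get(key)
        if element is None:
            raise ValueError(f"unknown data element: {key}")
        if not isinstance(value, element.data_type):
            raise ValueError(f"{key} must be {element.data_type.__name__}")
        if element.allowed_values and value not in element.allowed_values:
            raise ValueError(f"{key} must be one of {element.allowed_values}")
    return record

def encode(record: dict) -> str:
    """Step (3): encode for transmission, here as JSON with sorted keys."""
    return json.dumps(validate(record), sort_keys=True)

def decode(message: str) -> dict:
    """Receiving side: parse and validate against the same dictionary."""
    return validate(json.loads(message))
```

Because both sides validate against the same dictionary, a record 
that one system can encode is one the other system can decode, 
which is the property that the fragmented systems described above 
lack. 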

On the other hand, our study reveals that 91.18% of the 
interviewed eHealth stakeholders, based on the levels of 
collaboration defined in this study, are capable of 
collaborating regardless of the tools they use to achieve such 
collaboration, as shown in Table 1. When rating collaboration in 
their respective organizations, 41.2% were satisfied, 35.3% 
rated it poor and 23.5% rated it normal. Looking at tools for 
collaboration, we found that phones led with 73%, followed by 
phone and email with 16%, email with 9% and version control 
(here we considered Git specifically) with 2%. These findings 
are consistent with a previous study on the adoption and use of 
ICT by healthcare workers, which reports that "Over 93% of the 
health care institutions use mobile phones in this regard" [8]. 
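
The 91.18% figure can be recovered from the stakeholder counts 
reported in Table 1. The short Python sketch below is only an 
arithmetic check; the dictionary layout and the function name are 
illustrative assumptions. 

```python
# Counts from Table 1: category -> (interviewed, collaborating).
TABLE_1 = {
    "Developers": (16, 13),
    "Implementers": (7, 7),
    "Clinicians": (22, 22),
    "Users": (57, 51),
}

def collaboration_rate(interviewed: int, collaborating: int) -> float:
    """Percentage of interviewed stakeholders who collaborate."""
    return 100.0 * collaborating / interviewed

total_interviewed = sum(n for n, _ in TABLE_1.values())
total_collaborating = sum(c for _, c in TABLE_1.values())
overall = collaboration_rate(total_interviewed, total_collaborating)

# Per-category rates, rounded to two decimals.
per_category = {
    name: round(collaboration_rate(n, c), 2)
    for name, (n, c) in TABLE_1.items()
}

print(f"overall: {overall:.2f}% of {total_interviewed} stakeholders")
for name, rate in per_category.items():
    print(f"{name}: {rate}%")
```

The overall rate is 93 collaborating stakeholders out of 102 
interviewed. 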

Furthermore, we looked into the defined levels of collaboration, 
starting with developers from different health organizations (in 
most cases, they are under the ICT department). We found that 
they are aware of the existence of tools such as Git or other 
version control systems (VCS) for collaboration, but so far no 
such tools are in use among them for the purpose of 
collaboration, for a number of reasons, mostly issues 
surrounding the privacy and security of their health records. 
This agrees with the study by Ndume, which reports that despite 
the existence of several collaboration tools, namely Ning 
(aimed at network expansion), Public Library of Science 
(knowledge expansion), Epic surveyors (remote functionality), 
Scribed (research promotion) as well as Skype, Wiser, Twitter 
and Facebook, some of the kits do not give researchers peace of 
mind with respect to the security, integrity and credibility of 
their work [17]. The situation is the same not only for 
researchers but also for other health stakeholders. About 79% of 
interviewees gave that response, but this is a more traditional 
way of thinking that can be changed with proper knowledge of 
those tools and of how to customize them to meet their security 
and privacy requirements. In the same way we considered the 
levels of clinicians (health care providers) and users; the 
interviews with them revealed that they are aware of 
collaboration, even though it is mostly done through mobile 
phones. 

At this point we asked why mobile phones are used so much more. 
The answers were obvious. Any member can buy a phone and will, 
in one way or another, find him or herself using it as a tool 
for collaborating with other members in the field. Also, poor or 
inadequate ICT infrastructure in most hospitals results in the 
mobile phone being the number one tool for collaboration. In 
addition, the "Tanzanian health sector is 
characterized by a fragmented landscape of ICT pilot projects 
and numerous data and health information system (HIS) silos 
with significant barriers to the effective sharing of information 
between healthcare participants" [10]. Hence it is clear that 
there is a problem of system interoperability. It was observed 
during the study that in some cases collaboration beyond the use 
of mobile phones or email was hindered simply because systems 
were not capable of sharing information, and because of poor 
willingness to collaborate among different health stakeholders 
for the reasons revealed in this study. Regardless of the 
challenges in this area, collaboration and the connection of a 
widespread network of stakeholders, both within the health care 
system and between the different health stakeholder levels, are 
important, as reported in [19]: 
"Realization of health care sector goals of the vision 2025 needs 
collaboration of all the key stakeholders involved in health". 

This calls for proper technological improvements, especially 
when dealing with interoperability, collaboration, security and 
privacy issues, as health data is highly sensitive and different 
health organizations have their own orientation, rules and 
policies. Although agreement may be reached on a mechanism for 
ensuring the privacy and security of health records and on the 
technological means and policy to be used, we must take into 
account that collaboration cannot be forced but can only be 
agreed upon. 

VI. CONCLUSION AND RECOMMENDATION 

In this paper we report an analysis of current operations and 
activities in the eHealth landscape of the Tanzanian mainland, 
focusing on interoperability and collaboration. Given that 
analysis of activities and operations in the eHealth landscape 
is an ongoing activity that needs time and resources, we 
selected key areas and features in order to meet the objectives 
and to capture the reality of the situation in the field, which 
was very important in this study. We found that it is important 
to consider introducing an ICT curriculum or ICT training 
sessions targeting eHealth for health workers in health training 
institutions. Doing so will increase awareness and effective use 
of ICT among health staff and facilitate its adoption by 
leveraging the presence of the National ICT Broadband Backbone 
(NICTBB). More effort is needed from the government, through the 
ministry in charge of the health sector, to promote a tradition 
of collaboration among different health stakeholders. Also, 
seminars on interoperability issues should be organized with the 
aim of increasing IT literacy among health professionals. 
Finally, inadequate support, budget limitations, security 
concerns and unreliable power supply were found to be the most 
common challenges facing eHealth activities; proper attention 
must be given to these challenges. 

Abstract: A biometric system analyzes unique biological 
features of a human being for the purposes of security and 
identification. Conventional biometric methods (such as 
face, iris and fingerprint recognition) are used for these 
purposes, but they can capture data only through physical 
contact or at a close distance from the recording device. 
Gait, a behavioral biometric, has attracted more attention 
recently because it can be captured at a distance without 
requiring the prior consent of the observed subject. This 
survey paper covers current trends and methods in gait-based 
surveillance systems that use triangle methods, and compares 
them. 

Keywords: Biometric, Gait Recognition, Image 
Processing, Triangle methods, Pattern Recognition. 

1. INTRODUCTION 

As the world becomes more advanced and 
computerized, security systems that were earlier human 
controlled [1] are being replaced by computerized 
surveillance systems based on image processing. Such 
systems identify a unique physical property, that is, a 
biometric property, of a person. Biometrics are divided 
into two groups: physiological properties (face, 
fingerprint, iris, DNA) and behavioral properties 
(signature, voice, walking pattern). 

Previously, biometric research concentrated on 
human authentication and authorization using face images, 
fingerprints, palm prints, shoe prints, iris images and 
handwriting. But these conventional biometric resources 
suffer from several limitations, such as the distance between 
the camera/scanner and the person, and the need for the 
person's (user's) co-operation for authentication and 
authorization. 

For visual surveillance applications, conventional 
biometric resources are difficult to use, and gait provides 
an interesting alternative. Gait describes the manner of a 
person's walking, so gait recognition is walking pattern 
recognition. It can be acquired at a distance and without 
the walker's co-operation or knowledge, which is why this 
method serves as a further security system. Gait analysis 
can be done with Genetic Algorithms (GA), Artificial Neural 
Networks (ANN), and mathematical (geometric) concepts using 
a Gabor system. In a previous study [2] the body is divided 
into two parts: the static or fixed upper part and the 
dynamic lower part, which moves more than the upper part. 
The upper body is subdivided into three parts: the head, 
the arm and the chest. The lower body is subdivided into 
four parts: the thigh (which includes the hip), the front 
leg, the back leg and the feet. Gait work has mainly been 
done on the lower body, because it moves more than the 
upper part and moving parts are easier to study. The front 
leg and back leg are treated as separate parts because of 
the bipedal (cyclic) walking style. When a person walks, 
the left leg and the right leg come to the front and back 
by turns and create a cycle. 

This survey paper is divided into five sections: 
section one contains the introduction, section two contains 
an overview of the biometric recognition system, section 
three contains the literature survey, section four contains 
a comparison of the triangle techniques studied in the 
literature survey and section five contains the conclusion. 

2. OVERVIEW 

An informative survey of current techniques for 
analyzing data on human movement has been given by 
Gavrila [3]. His work covers visual analysis, looking at 
gestures and whole body movement. His survey presents 
results on recognizing humans and their activities by 
computer, so that a computer can interact intelligently and 
effortlessly with a human inhabited environment. 

The basic biometrics surveillance system has the following 
components: 

Capture video: The video is captured by high quality 
digital cameras. 

Convert into frames: The video is converted into the gait 
frames of one cycle, according to the view (here, the side 
view). 

Analysis of each frame based on the approach: Each 
frame is analyzed according to the method that is used. 

Correlate as a Triangle Feature: In this component new 
data are stored in the database and, if existing values are 
found, the result is decided. 

Database: Here the data are stored. 
Result: Based on the input, the output is generated. 



Fig. 1. Basic biometrics surveillance system for triangle 
approach (flowchart: Capture video, Convert into frames, 
Analysis of each frame based on approach, Correlate as a 
triangle feature, Database, Result) 

3. LITERATURE SURVEY 

In this section we discuss the following approaches: 
the body joint position based approach, the angle based 
gait detection approach, the area of triangle based 
approach, a novel method of gait recognition using a fuzzy 
inference system, and the gait geometric characteristic and 
fuzzy logic based approach. 

3.1 Position joint base human body detection: In [4] 
gait recognition means identifying individual persons or 
subjects by analysis of the patterns generated in each 
frame of a cycle. Gait recognition identifies individuals 
by the way they walk, regardless of disturbances such as 
background, clothes and so on. From the viewpoint of 
biomechanics, walking involves the synchronous movement of 
hundreds of muscles and joints. Gait is completely 
determined by the structure of muscles and bones. Although 
all people's movements are based on biped patterns, the 
gait of each person differs in relative timing, step range 
and so on. So gait is believed to be particular to each 
person and can be used as a feature in personal identity 
recognition, especially at a distance. In this paper they 
used a simple gait recognition method based on different 
points on body joints. They first extract the human subject 
from each frame of the moving body in the form of a 
silhouette image. A silhouette is an image that contains 
only the black outline of the subject [4]. In that image, 
they identify 12 different body points and compute 9 
different angles between those points: the trunk angle, and 
the angles of the left arm, right arm, left forearm, right 
forearm, left thigh, right thigh, left shank and right 
shank. 

Then they calculated the limb angles and applied 
the discrete Fourier transform over each cycle. Two 
frequency-domain features of the angles, the 
amplitude-frequency and phase-frequency characteristics, 
are chosen. Finally, a nearest neighbor classifier is used 
to classify subjects from the database. In their work they 
used the "SOTON" dataset to simulate their results. The 
SOTON data set has 118 images in total. From those they 
used 10 images and obtained a Correct Classification Rate 
(CCR) of 78%, which was better than other methods presented 
at that time, such as body shape and template correlation 
(CMU), whose CCR was 45%, and static body parameters 
(Georgia Tech), whose CCR was 73%. 
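
The feature extraction and classification steps described for [4] 
can be sketched as follows. This is a minimal illustration in 
plain Python, not the authors' implementation: the number of 
retained harmonics, the Euclidean distance and all function names 
are assumptions. 

```python
import cmath
import math

def dft(signal):
    """Discrete Fourier transform of a real sequence (direct O(n^2) form)."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

def gait_features(angle_cycle, harmonics=3):
    """Amplitude and phase of the first few DFT harmonics of one limb-angle
    sequence sampled over a single gait cycle.

    The number of harmonics kept is an assumption of this sketch.
    """
    spectrum = dft(angle_cycle)
    n = len(angle_cycle)
    features = []
    for k in range(1, harmonics + 1):
        features.append(abs(spectrum[k]) / n)      # amplitude component
        features.append(cmath.phase(spectrum[k]))  # phase component
    return features

def nearest_neighbor(query, gallery):
    """Label of the gallery feature vector closest to `query`.

    Uses plain Euclidean distance and ignores phase wrap-around,
    which is a simplification.
    """
    return min(gallery, key=lambda label: math.dist(query, gallery[label]))
```

In use, one feature vector per enrolled subject would be stored in 
`gallery` (a dict from subject label to features), and a probe 
cycle would be assigned the label of its nearest stored vector. 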

3.2 Angle based gait detection: In [5] angle based 
gait detection is presented as important and more efficient 
than other methods. In this work two parts of the human 
body, the hand and the feet, are used for feature 
extraction, and the calculation is done on those features. 
They took three feature points, chosen mainly from the 
lower body because most movement occurs there. They took 
both feet (left and right) and, as the third point, the 
hand that is visible in the side view, and constructed a 
triangle. More specifically, the center points of the bases 
of both feet are taken as two vertices of the triangle, and 
the hand is used as the third vertex. They calculate [5] 
the angles formed by the slope method, which uses the 
tangent formula. They calculated three angles for each 
frame and, after completion of one cycle, calculated the 
mean values. 
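
The per-frame angle computation described for [5] can be sketched 
as below. The slope form of the tangent formula, 
tan(theta) = |(m1 - m2) / (1 + m1*m2)|, is undefined for vertical 
edges, so this sketch uses the equivalent vector form 
tan(theta) = |u x v| / (u . v), evaluated with atan2. The point 
coordinates and function names are hypothetical. 

```python
import math

def interior_angle(vertex, p, q):
    """Angle in degrees at `vertex` between the edges vertex->p and vertex->q.

    Uses tan(theta) = |u x v| / (u . v); atan2 keeps the result in
    [0, 180] and avoids the division by zero that the slope form
    has for vertical edges.
    """
    ux, uy = p[0] - vertex[0], p[1] - vertex[1]
    vx, vy = q[0] - vertex[0], q[1] - vertex[1]
    cross = ux * vy - uy * vx
    dot = ux * vx + uy * vy
    return math.degrees(math.atan2(abs(cross), dot))

def triangle_angles(hand, left_foot, right_foot):
    """Three interior angles of the hand / left-foot / right-foot triangle."""
    return (
        interior_angle(hand, left_foot, right_foot),
        interior_angle(left_foot, right_foot, hand),
        interior_angle(right_foot, hand, left_foot),
    )

def mean_cycle_angles(frames):
    """Mean of each of the three angles over all frames of one gait cycle.

    `frames` is a list of (hand, left_foot, right_foot) point triples.
    """
    per_frame = [triangle_angles(*frame) for frame in frames]
    return tuple(sum(column) / len(per_frame) for column in zip(*per_frame))
```

The three mean angles returned by `mean_cycle_angles` correspond to 
the per-cycle values that the method stores for each subject. 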

A cycle is formed when a person whose 
walking [11] posture is being captured returns to the 
same posture as the starting posture. In this paper they 
focus on angle based analysis and apply the method to the 
CASIA A database, which provides side view images. They 
took images of 17 subjects and calculated the angles for 
each frame of one cycle; the resulting correct 
classification rate was 90%, which is better than other 
methods. 

3.3 An approach for human gait identification 
based on the area of a triangle: In [6] the biometric 
data are collected as video input. In preprocessing, the 
initial video is captured and then converted into frames 
for that particular person. In this work they considered 
the side [10] view of the subjects. They considered three 
parts of the human body for feature extraction: the left 
hand, the right foot and the left foot. These give three 
feature points, which are marked as white dots (high 
resolution points). They create a triangle between the 
extracted points and calculate the length of each of its 
three edges [6], edge 1 (a), edge 2 (b) and edge 3 (c). 
Then, for all the frames of one cycle, the mean values of 
the edges are calculated and stored in the database. 
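
A minimal sketch of the edge-length features described for [6] 
follows. The text above describes storing the mean edge lengths; 
the area computed with Heron's formula is added here as an 
assumption suggested by the method's title. Coordinates and 
function names are hypothetical. 

```python
import math

def edge_lengths(left_hand, right_foot, left_foot):
    """Lengths a, b, c of the triangle through the three feature points."""
    a = math.dist(left_hand, right_foot)
    b = math.dist(right_foot, left_foot)
    c = math.dist(left_foot, left_hand)
    return a, b, c

def heron_area(a, b, c):
    """Triangle area from its edge lengths (Heron's formula)."""
    s = (a + b + c) / 2.0
    # max(..., 0.0) guards against tiny negative values from rounding
    # when the three points are nearly collinear.
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))

def mean_cycle_edges(frames):
    """Mean of a, b and c over all frames of one gait cycle.

    `frames` is a list of (left_hand, right_foot, left_foot) point triples.
    """
    per_frame = [edge_lengths(*frame) for frame in frames]
    return tuple(sum(column) / len(per_frame) for column in zip(*per_frame))
```

The tuple returned by `mean_cycle_edges` plays the role of the 
per-cycle record that is stored in the database. 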

They calculate correct classification rates for 
two analyses. The first is individual distance 
classification and the second is pair distance 
classification. This experiment demonstrated that features 
selected by pair distance give better results than 
individual distance. The correct classification rate of the 
first method is 66.6% and that of the second method is 
82.3%. It was concluded that the second analysis gives the 
better classification rate. 

3.4 A Novel Method of Gait Recognition Using
Fuzzy Inference System: The work in [4] uses a body
joint method. Five feature points are extracted from three
body parts: the left foot, the right foot and the hand. Two
points are taken on the left foot (toe and ankle), two on the
right foot (again toe and ankle) and one on the hand. Two
triangles are constructed: the first between the left toe, the
hand and the right toe, and the second between the left
ankle, the right ankle and the hand. The two triangles
intersect, generating two intersection points, which are
computed with the parametric line equation. The points are
calculated for each frame of each cycle, and the mean value
for each cycle is stored in the database. The experiment is
based on an outdoor gait database of various subjects,
using a side view of the walking direction. Applying the
algorithm to this database gives a correct classification rate
of 90%, which compares well with the other methods.
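The intersection of two triangle edges by the parametric line equation can be sketched as follows. The function name `segment_intersection` and the sample coordinates are assumptions made for illustration; [4] does not give its implementation.

```python
def segment_intersection(p1, p2, p3, p4):
    """Intersect segment p1-p2 with segment p3-p4.

    Each segment is written parametrically:
        P(t) = p1 + t * (p2 - p1),  Q(u) = p3 + u * (p4 - p3)
    and P(t) = Q(u) is solved for t and u in [0, 1].
    Returns the intersection point, or None if there is none.
    """
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    d1x, d1y = x2 - x1, y2 - y1
    d2x, d2y = x4 - x3, y4 - y3
    denom = d1x * d2y - d1y * d2x
    if denom == 0:
        return None  # parallel or collinear edges
    rx, ry = x3 - x1, y3 - y1
    t = (rx * d2y - ry * d2x) / denom
    u = (rx * d1y - ry * d1x) / denom
    if not (0 <= t <= 1 and 0 <= u <= 1):
        return None  # the infinite lines cross outside the segments
    return (x1 + t * d1x, y1 + t * d1y)

# Hypothetical edges from the two triangles.
point = segment_intersection((0, 0), (2, 2), (0, 2), (2, 0))
```

Averaging such points over the frames of a cycle gives the per-cycle mean that the method stores.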



3.5 Gait Recognition with Geometric
Characteristic and Fuzzy Logic: In [7], gait is defined
as "a particular way one person walks". It is a process
which is divided into stages [8]. A walking pattern is
analyzed in terms of the gait cycle, and the style of
walking, or gait cycle, of every person is unique [9].
Human gait is the repeated motion of the body parts. The
head and shoulders move much less than the hands and
legs. The repeated motion of the body parts forms a gait
cycle. A gait cycle, or stride, is defined as the movement
from an initial heel position until the heel returns to that
position. A single gait cycle is further divided into two
phases. In the proposed method, two parts of the human
body are considered: the first is the hand and the second
is the feet. The second is subdivided into two portions,
the toe (left and right feet) and the heel (left and right
feet). In total, five extraction points are identified,
marked by high-resolution white points. Two triangles are
formed from these five points: one between the toe of the
left foot, the hand and the toe of the right foot, and one
between the heel of the left leg, the hand and the heel of
the right leg. The two intersection points [7] taken for
study are known as I and I'.

In the triangle construction, point A represents the
hand, points B and C represent the toes of the left and
right feet respectively, and points D and E represent the
heels of the left and right feet respectively. The
intersection points are calculated for each frame and then
for a complete cycle, after which their mean values are
calculated. These mean values are the input to a fuzzy
inference system (FIS), which compares them with the
database values and produces a result according to the
following fuzzy rules: if the matching value is greater
than 85%, the matching is excellent; if it lies between 75%
and 85%, the matching is good; if it lies between 60% and
75%, the matching is average; and if it is less than 60%,
the matching is poor.

Result analysis of the proposed method is done on the
CASIA dataset for gait recognition. Seventeen subjects
are taken, with 23 frames each, which complete the gait
cycle; only one side is considered. The MPEG files of the
17 subjects are converted into JPEG frames, white dot
pixels are placed on the RGB frames of each subject, and
the RGB frames are then converted into gray scale for
further processing. The result shows that the correct
classification rate of this method is 88%.
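The four matching grades can be written as a small lookup. Note that this sketch uses crisp thresholds rather than a full fuzzy inference system with membership functions, and the handling of the exact boundary values (75% and 60%) is an assumption, since the text does not say which grade they belong to. The name `match_grade` is hypothetical.

```python
def match_grade(match_percent):
    """Map a matching percentage to the grade named in the rules above."""
    if match_percent > 85:
        return "excellent"
    if match_percent >= 75:   # assumed: 75 up to and including 85 is good
        return "good"
    if match_percent >= 60:   # assumed: 60 up to 75 is average
        return "average"
    return "poor"

grades = [match_grade(v) for v in (90, 80, 60, 59.9)]
```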



4. COMPARISON 
After reviewing different papers and work on
these approaches, the correct classification rate (CCR)
obtained by different researchers in the field of triangle
based gait recognition is summarized in Table 1, together
with the number of persons under surveillance.



S.No.   Approach                                          No. of Persons        CCR
                                                          under Surveillance
1.      Angle based gait detection                        18                    90%
2.      An approach for human gait identification         18                    82%
        based on area of triangle
3.      A Novel Method of Gait Recognition Using          18                    90%
        Fuzzy Inference System
4.      Gait Recognition with Geometric Characteristic    15                    88%
        and Fuzzy Logic

Table 1 : Comparison of Various Approaches

Table 1 indicates that the method based on
positioning body joints [4], A Novel Method of Gait
Recognition Using Fuzzy Inference System, and angle
based gait detection give the same result, which is the
best among the methods mentioned above. These methods
use the concepts of fuzzy inference and angle based
recognition respectively. Their algorithms were verified
on a gait database that includes 18 different subjects, and
the experiments are all based on side view images. The
other methods also give good results, but the way the area
or angle is calculated changes the results. The comparison
chart shows the methods and their CCR in percent, where
CCR is the correct classification rate.



[Chart: CCR (%) of the compared methods]




5. CONCLUSION 

In this review paper, we presented a comparison
between different gait recognition techniques that are
based on triangles formed from different body parts. We
discussed only lower body part analysis and selected
works that are based on common components. All the
techniques use a triangle based method: they construct a
triangle from extracted feature points, and recognition is
then done by various methods. From this review, it has
been observed that angle based gait detection and the
novel fuzzy inference method give the same result.
Abstract - Sanskrit literature is unique in its overwhelmingly
poetic character. Subjects like science, engineering, medicine,
grammar and law are mostly written in the form of poetry, which
makes them easy to memorize. Sanskrit poetry, composed of
Shloka or verse, is classified in terms of a unique meter or Vrutta.
A Vrutta is the unique pattern formed by the categorization of
letters as long and short syllables. Rule based identification of
the Vrutta in a verse facilitates the rhythmic chanting of the
Shloka. This paper discusses a method of identifying the Vrutta
in a Sanskrit Shloka and suggests musical notations, based on
the identified Vrutta, for singing the Shloka. The designed
system, "Sangit Vrutta Darshika", can be used as a guide to learn
the construction of Sanskrit verse. It also facilitates the
systematic singing of Sanskrit Shloka, which has applications in
areas like Music Therapy.

Keywords -Grammar, Long syllable, Meter, Metrical 
classification, Short syllable, Natural Language Processing, 
Sanskrit, Shloka, Vrutta. 



I. Introduction 

Of all the discoveries made in the course of human history,
language has been the most significant. Without language,
civilization could not have progressed. Languages have
been used as a means of communication since ancient times.
Presently, the basic language structure has found a new horizon
of machine communication in the form of modern computer
programming languages. Since the digital computer requires
some form of language construct for its efficient operation,
Computational Linguistics, which deals with the typical
characteristics of such constructs, is a rapidly developing field.
This scientific outlook on various language structures led
scientists worldwide to recognize the importance of Sanskrit.
Sanskrit is one of the oldest living languages on our planet.
Research organizations like NASA have been looking at Sanskrit
as a possible computer language [1]. Sanskrit is the systematized
language of a rich classical literature, and its alphabet is
systematically arranged and easy to remember. The grammar and
syntax of Sanskrit are highly regular, leaving little room for
error, which makes Sanskrit well suited to the development of
certain computer applications. The well-knit (i.e. syntactically
and semantically strong) structure of this language has
encouraged current research. From this perspective, Sanskrit
grammar studies have received serious attention worldwide
regarding the truthful representation of communication.

Fig. 1 shows various areas of Sanskrit literature addressed by
computer.

Fig. 1. Various areas of Sanskrit addressed by computer [areas
shown: as an AI language; grammar reconstruction for poetic
constructs; grammar classification of text; machine learning and
translation tool]

The legendary Sanskrit grammarian of the 5th century BC, Panini,
is regarded as the world's first computational grammarian. Panini
wrote the Ashtadhyayi (the Eight-Chaptered book) [15], which is
considered to be the most comprehensive scientific grammar
ever written for any language. Many approaches have been
proposed by scientists and grammarians worldwide to extract the
richness of the Sanskrit language in various contexts.
Sanskrit poetry, composed of verse or Shloka, is classified
in terms of its meter or Vrutta. The Indian scholar and musical
theorist Pingala, in his Chhanda Sutra, used marks
indicating long and short syllables to indicate the meters or Vrutta
in Sanskrit poetry. More than 150 Vrutta exist in different
poetic forms. The rule based identification of the Vrutta in a
given verse decides the way in which the verse can be sung.

Rhythmic chanting of a Shloka or verse makes the Shloka
easier to memorize. According to Mantra therapy, rhythmically
chanted Shlokas have a beneficial effect on the body. In this
context, difficulties are generally faced by common people who
are unaware of these rules but want to sing the verse correctly.

Our research addresses this problem by providing a system for
people who are unaware of the construction rules of verse but
want to learn them and sing the verse correctly.

The paper is organized as follows: In section II we 
discuss related work in computational processing of Sanskrit 
language. Sections III and IV explain the designed system and 
implementation details along with the example. In section V we 
conclude the work and propose the future directions. 

II. Related work 

Rick Briggs [1] showed how Sanskrit, a natural language,
can also serve as an artificial language. He compares semantic
nets with the method used by ancient Indian grammarians to
analyze a sentence, and analyzes the parallelism between the
two. Consider the sentence "John gave the ball to Mary". The
action involved is "to give", but there also exist intermediate
or auxiliary actions: John holding the ball in his hand is the
starting point, and the ball going into Mary's hand is the end
point.
Auxiliary activities (karakas) are stated in Sanskrit by means
of seven case endings, e.g. agent, object, instrument, recipient,
point of departure and locality. Consider the example sentence
"Out of friendship, Maitra cooks rice for Devadatta in a pot,
over a fire".

In triple form the sentence can be written as

Cook, agent, Maitra

Cook, object, rice

etc., which is very similar to the approach of computer
processing. The Sanskrit sentence for the same is written as

Maitrah: Sauhardyat Devadattaya odanam ghate agnina
pachati.

In both representations the activities are considered as
events. For example, instead of "cooking", it is considered
that an activity is going on, which is cooking.
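The triple form lends itself to a simple data structure. In the sketch below only the agent and object triples come from the text; the recipient, instrument and locality triples are an assumed reading of the example sentence (Devadattaya, agnina, ghate), and the names `Triple` and `roles_of` are hypothetical.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    action: str
    role: str      # the karaka, e.g. agent or object
    filler: str

TRIPLES = [
    Triple("cook", "agent", "Maitra"),         # from the text
    Triple("cook", "object", "rice"),          # from the text
    Triple("cook", "recipient", "Devadatta"),  # assumed reading
    Triple("cook", "instrument", "fire"),      # assumed reading
    Triple("cook", "locality", "pot"),         # assumed reading
]

def roles_of(action, triples):
    """Collect the role -> filler mapping for one action."""
    return {t.role: t.filler for t in triples if t.action == action}

roles = roles_of("cook", TRIPLES)
```

A query such as `roles["agent"]` then answers "who cooks?" without parsing the sentence again, which is the parallel with semantic nets that the text draws.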

Rama N. and Meenakshi Lakshamana [2] proposed an
approach to the rule based division of a Shloka, considering
the fact that a Sanskrit verse is a sequence of four quarters.
Each quarter is classified either by the number of
syllables (akshara-s) or by the number of syllabic instants
(matra-s). The determination of meters is based on either of
these factors. Meters based on the first factor are called Varna
meters, while those based on the second are termed Jati meters.
In Varna meters, two types of syllables are present: the long
(Guru) and the short (Laghu). A first algorithm converts
the Sanskrit verse into binary form, considering only the long
and short syllables in it. A second algorithm classifies the
verse by splitting it in two parts and categorizing it as a Sama,
Ardhasama or Vishma meter. The limitation of this
approach is that the number of meters in actual use across the
literature is much smaller than the number theoretically
possible. Hence, this work handles the meters in vogue, which
themselves constitute a sizeable quantity and pose non-trivial
computational problems.
Aasish P. and Ratna S. [3] proposed an analysis of
Sanskrit grammar for machine translation, together with a
tokenizer that provides a solution for "Samaas Vigraha". For
parsing a Sanskrit sentence, two major factors regarding the
complexity of words are considered: "Sandhi", which is the
combination of two words to produce a new word, and "Samaas",
which is the combination of two words depending on their
semantics. While designing the parser, rules are defined
based on factors like part of speech (POS), list of words,
case end and begin, and declension.
Some of the actions or functions are

1) SetDecl (declension case for a specified token)

2) Add before (add a string before a specified token) etc.

The rules are stated for dissolving compounds. The input to the
parser "Vaakkriti" is a Devanagari text, and the output of the
system is the set of tokens produced after compound dissolving.
This system will fail to produce the required output when a
Sanskrit poem is given as input. A Sanskrit poem can convey
more than one meaning and sometimes uses figures of speech,
which makes it more complex.

G. Huet [7] proposed a method for the processing of Sanskrit
by computers. The proposed software analyses a Sanskrit
sentence according to the possible interpretations of Sandhi
analysis. A Sanskrit lexical database is constructed, and a
two-tape transducer is modeled for Sandhi analysis. In
Sanskrit text, the words are not separated by blanks and
punctuation symbols but are merged together by external
Sandhi, so segmentation is required. The lexicon directed
segmenter is further extended into a tagger.

Subhash Kak [9] describes the classification schemes for
meters from the Vedic age. Sanskrit meters are based on a
system of short and long syllables, represented by 0 and 1.
Meters have different lengths. In the Chandashastra, Pingala
stated two basic schemes of representing meters, which
indicate an octal representation. The representation of
verse-feet is given by a number coding of three syllables,
but the order of bits is reversed from the modern
representation. In Katapayadi (KTPY) notation, numerals are
represented as letters of the alphabet. It shows an irregularity
in mapping the numerals above three, which is not present in
Pingala's mapping. The author gives the construction of the
mapping behind Pingala's scheme, analogous to the KTPY
notation, called the Katyasadi (KTYS) notation.
Anoop M. Namboodiri, P.J. Narayanan and C.V. Jawahar [13]
proposed a framework that uses the rich metric and formal
structure of classical poetic forms for post-processing the
output of a recognizer such as an OCR engine. They proposed
an algorithm for the processing of poetry, which can be used
in conjunction with other post-processing approaches and for
the correction of modifier symbols that are difficult for OCR
to recognize.

The existing approaches propose marking Laghu and Guru but
do not address classification or the generation of musical
notations for the input.

III. Sangit Vrutta Darshika 

In ancient Indian poetry, the oral tradition is dominant,
because verses are easy to memorize. Rules are defined for
composing the lines of a Shloka, and the set of these rules
forms a set of structures. These rules are known as Vrutta.
Each Vrutta can be identified by a unique pattern of letters
or akshara. The Vrutta are mandatory rules in poetry. A
Sanskrit Shloka is composed of quarters or Charan.

There are two broad categories of Vrutta in Sanskrit
Shloka: Gana Vrutta and Matra Vrutta.

In a Gana Vrutta, each Charan in the Shloka has the same
number of letters, with the same number of Laghu and Guru,
so it is also known as Akshar Gana Vrutta.

In a Matra Vrutta, the number of letters in each Charan may
not be identical. Each short syllable is assigned the value 1
and each long syllable the value 2, and the Vrutta is
identified from the sum of Matra in each Charan.
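The Matra sum for one Charan is a one-line computation once each syllable is marked. In this sketch a Charan is given as a string of "L" (Laghu) and "G" (Guru) marks; that encoding and the name `matra_count` are assumptions for illustration only.

```python
def matra_count(charan_marks):
    """Sum the Matra of one Charan: Laghu counts 1, Guru counts 2."""
    values = {"L": 1, "G": 2}
    return sum(values[mark] for mark in charan_marks)

total = matra_count("LGGLG")  # 1 + 2 + 2 + 1 + 2
```

Two Charan with different numbers of letters can therefore still have the same Matra sum, which is why this category is identified by the sum rather than by the letter pattern.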
The designed system, "SANGIT VRUTTA DARSHIKA",
focuses on classifying a Sanskrit Shloka according to the
identified Vrutta. The Vrutta we have considered fall in the
category of Akshar Gana Vrutta. The functionality of the
system can be understood from the block diagram given in
Fig. 2.

As shown in the figure, the input to the system is a Sanskrit
Shloka in Unicode format. The identified Vrutta and the
musical notations, in text and audio format, are displayed as
output.
Fig. 2. Block diagram of the system [blocks: input Shloka in
Unicode format; Shloka preprocessing; pattern formation; unique
identifier generation for each pattern; pattern matching for the
given Shloka with the stored pattern; mapping with the musical
notations; output (identified Vrutta and musical notations in
text and audio format)]

Mathematically, the system is represented as a function f(x),
where f(x) identifies the Vrutta and determines the suggested
musical notation for a given Shloka. The objectives of this
function are to search for the Akshar Gana Vrutta in the Shloka
and to specify the suggested musical notation according to the
identified Vrutta. The input Shloka is constrained to have an
Akshar Gana Vrutta as its classification scheme. Let S be the
system that describes the problem, i.e.

$S = \{\{I\}, \{O\}, F_v, F_n, S_c, F\}$

where

$I = \{S_1, S_2, \ldots, S_n\}$ is the set of sentences,

each $S_i = \{W_1, W_2, \ldots, W_n\}$ is a set of words, with the
words $W_i$ of every $S_i \in I$ separated by a delimiter,

and each $W_i = \{L_1, L_2, \ldots, L_n\}$ is a set of letters.

For all letters $L$, the Gana are formed as

$G_i = \{L_{i+1}, L_{i+2}, L_{i+3}\}$, where $i = 0, 3, 6, 9, 12, 15$.

Assign $L = 1$ for a Guru or long syllable
and $L = 0$ for a Laghu or short syllable.

For every letter $L_i$: if $L_i \in LV$, or $L_i \in SL$ and
carries or is followed by a Guru-marking sign (such as ':'), then

$L_i = 1$ (Guru)

where LV is the set of long vowels and SL is the set of short
syllables.

Else, if $L_i \in SV$, or $L_i \in SV$ followed by a short vowel
sign, then

$L_i = 0$ (Laghu)

where SV is the set of short vowels.

If a Laghu letter is followed by a Jodakshar (conjunct
consonant), that Laghu letter is considered as Guru. A
Jodakshar itself is considered as Laghu, but if it is followed by
"visarga", "anusvar", "kana" or "dirgha velanti" it is
considered as Guru.

Let $D_g$ be the database of Gana,

$D_g = \{D_{g0}, D_{g1}, \ldots, D_{g7}\}$

where

$D_{g0} = \{000\}$ = न (na)
$D_{g1} = \{001\}$ = स (sa)
$D_{g2} = \{010\}$ = ज (ja)
$D_{g3} = \{011\}$ = य (ya)
$D_{g4} = \{100\}$ = भ (bha)
$D_{g5} = \{101\}$ = र (ra)
$D_{g6} = \{110\}$ = त (ta)
$D_{g7} = \{111\}$ = म (ma)

$\forall G\ \exists G_i \in G$, if $G_i \in \{D_{g0} \lor D_{g1} \lor D_{g2} \lor \ldots \lor D_{g7}\}$,

where $i = 0$ to $7$.

Let V be the database of Vrutta names,

where $V = \{S_0, S_1, \ldots, S_9\}$:

$S_0$ = Shardulvikridit = {Pattern of Shardulvikridit}
$S_1$ = Mandakranta = {Pattern of Mandakranta}
$S_2$ = Bhujangprayat = {Pattern of Bhujangprayat}
$S_3$ = Prithvi = {Pattern of Prithvi}
$S_4$ = Shikharini = {Pattern of Shikharini}
$S_5$ = Stagdhara = {Pattern of Stagdhara}
$S_6$ = Hansagati = {Pattern of Hansagati}
$S_7$ = Vasanttilaka = {Pattern of Vasanttilaka}
$S_8$ = Malini = {Pattern of Malini}
$S_9$ = Indravajra = {Pattern of Indravajra}

If $\exists G \in V$, then

$G = \{S_0 \lor S_1 \lor S_2 \lor \ldots \lor S_9\}$

Let N be the set of Hindustani music notations in text, and M
the set of music files. For each Vrutta there are three music
files: two files of different energy levels, suitable for male
and female voice, recorded using violin, and one vocal file of a
popular Shloka or Subhashit in that particular Vrutta.

$N = \{N_0, N_1, \ldots, N_9\}$, $M = \{M_0, M_1, \ldots, M_9\}$

If $G_i = S_i$, display $S_i$, $N_i$, $M_i$, where $i = 0, 1, 2, \ldots, 9$.

The sample words as examples of Gana are given in Table 1.
"U" is the symbol used for marking a Laghu letter and "_" is the
symbol used for marking a Guru letter [14].

TABLE 1. LIST OF GANA WITH EXAMPLE

Gana Name    Gana Formation    Example
य (ya)       U _ _ (011)
म (ma)       _ _ _ (111)
त (ta)       _ _ U (110)
र (ra)       _ U _ (101)
ज (ja)       U _ U (010)
भ (bha)      _ U U (100)
न (na)       U U U (000)
स (sa)       U U _ (001)



Fig. 3 illustrates the method of finding a Guru in the Sanskrit
Shloka.

Fig. 3. Rules of marking Guru

Fig. 4 illustrates the method of finding a Laghu in the Sanskrit
Shloka.

Fig. 4. Rules of marking Laghu

IV. Implementation Details

The overall flow of the system implementation can be understood
from the block diagram in Fig. 5.

Fig. 5. Implementation flow of the system [flow: Sanskrit Shloka
(input); assign Laghu or Guru to each letter in the Shloka; group
three Laghu and/or Guru to generate the pattern (Gana); identify
the Vrutta from the Gana by pattern matching; specify the possible
musical notation]

The designed system accepts a Sanskrit Shloka in Unicode
format. Unicode provides a unique number for every
character, irrespective of the platform and the language, and
UTF-8 encodes each Unicode character. Once the Shloka is
stored in UTF-8 format, the following process is carried out:

1. Laghu or Guru is assigned to each letter according to
grammar rules.

2. The Shloka is divided into groups called Gana,
where each Gana consists of a combination
of Laghu and/or Guru. Depending upon
the Laghu and Guru assignments within a Gana, each
Gana is assigned a unique identifier. The
identifier is a unique alphabet or akshara for a
specific Gana.

3. Once identifiers are assigned to the Gana of the input
Shloka, their pattern is checked against the specific
Vrutta patterns.

4. If the pattern matches, it is a success case and the
Vrutta is identified as the output.

The system can be explained with the help of the following
example:

Consider the Shloka: 



The stepwise analysis of the above Shloka will be done as 
follows: 
Step 1   (the Shloka divided into gana of three letters each)

Step 2   - - -   U U -   U - U   U U -   - - U   - - U   -

Step 3   111     001     010     001     110     110     1

Step 4   म       स       ज       स       त       त       ग

1. In step 1 the input Shloka is divided into groups
called gana. Each gana consists of 3 letters.

2. In step 2 the Laghu (U) and Guru (-) assignment for
each gana is done according to the following rules:

Rules for marking Guru:

a. Short syllables followed by a Guru-marking sign
(such as "kana", "dirgha velanti", "anusvar" or
"visarga") are considered as Guru.

b. The long vowels are considered as Guru.

Rules for marking Laghu:

a. Short syllables are marked as Laghu.

b. Short syllables followed by a short vowel sign
are marked as Laghu.

c. The short vowels are also considered as Laghu.

3. In step 3 all Laghu are assigned the number 0 and
all Guru are assigned the number 1.

4. In the next step a unique alphabet or akshara is
assigned to each gana according to the order in which
Laghu and Guru appear in that group.

The occurrence of the akshara in a fixed order leads to the
identification of a particular Vrutta.
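Steps 3 and 4 and the final pattern match can be sketched as below. This is an illustration, not the authors' implementation: the function names are hypothetical, the transliterated gana names follow the standard prosody convention (with leftover syllables named "la" for Laghu and "ga" for Guru), and only two Vrutta patterns are stored, both taken from standard definitions.

```python
# 0 = Laghu, 1 = Guru; each group of three marks maps to one gana.
GANA = {
    "000": "na", "001": "sa", "010": "ja", "011": "ya",
    "100": "bha", "101": "ra", "110": "ta", "111": "ma",
}

# Stored patterns (assumed subset of the ten Vrutta handled by the system).
VRUTTA = {
    ("ma", "sa", "ja", "sa", "ta", "ta", "ga"): "Shardulvikridit",
    ("ta", "ta", "ja", "ga", "ga"): "Indravajra",
}

def to_ganas(bits):
    """Split a 0/1 string into gana names; leftovers become la/ga."""
    full = len(bits) - len(bits) % 3
    names = [GANA[bits[i:i + 3]] for i in range(0, full, 3)]
    names += ["ga" if b == "1" else "la" for b in bits[full:]]
    return names

def identify_vrutta(bits):
    """Return the Vrutta name for one quarter, or None if unknown."""
    return VRUTTA.get(tuple(to_ganas(bits)))

# The 19-syllable pattern from the worked example.
quarter = "111" + "001" + "010" + "001" + "110" + "110" + "1"
name = identify_vrutta(quarter)
```

Because the lookup key is the whole gana sequence, adding another Vrutta only requires adding one more entry to the pattern table.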

In the example explained above, the Vrutta 'Shardulvikridit'
is present, which can be identified by the fixed pattern of Gana
identifiers

{म स ज स त त ग}

After identification of the Vrutta, the possible musical
notations are displayed to the user and an audio file is played
according to the user's choice of energy level.

In Sanskrit literature more than 150 Vrutta exist. For
illustration, the analysis of ten Vrutta by the designed system
is shown in Table 2. The table shows the Vrutta names and the
Gana patterns for each Vrutta.
TABLE 2. LIST OF CLASSIFIED VRUTTA

Sr. No.   Vrutta Name        Gana Pattern
1         Shardulvikridit    म स ज स त त ग
2         Bhujangprayat
3         Prithvi
4         Shikharini         य म न स भ ल ग
5         Stagdhara
6         Hansagati
7         Vasanttilaka       त भ ज ज ग ग
8         Malini
9         Indravajra         त त ज ग ग
10        Mandakranta        म भ न त त ग ग

The designed system analyzes the Shloka and provides the
output: the identified Gana and the Vrutta of the Shloka. It
also gives the musical notations in Devanagari and, according
to the choice of the user, plays the audio file of a specific
energy level. An example of Vrutta identification is given
below along with snapshots.
Fig. 6. Input

Fig. 7. Output [screen text: "PROGRAM TO IDENTIFY SANSKRIT
VRUTTA IN SHLOKA"; the Shloka; the Ganas identified for the
Shloka; the Vrutta identified for the Shloka: Hansagati Vrutta;
the musical notations identified for the Vrutta]

Fig. 8. Musical Notations of Vrutta Hansagati



Fig. 6 shows the main screen, which lets the user provide a
Sanskrit Shloka as input. The user can also select one of the
pre-entered Shlokas stored in a text file.

Fig. 7 displays the identified Gana, Vrutta and musical
notations in Devanagari for the given input.

Fig. 8 shows the suggested musical notations for the entered
Shloka, accompanied by audio output in a violin instrumental
rendering.

The following graph shows the number of Laghu or Guru, the
number of Gana for a particular Vrutta and the time required
for computation.
TABLE 3. RESULT ANALYSIS [graph "Analysis of the System":
number of Gana, time required in ms, and number of Laghu and
Guru for Indravajra, Malini, Vasanttilaka, Hansagati, Stagdhara,
Shikharini, Prithvi, Mandakranta, Bhujangprayat and
Shardulvikridit]

From the graph shown in Table 3 it can be observed that the
time required for identification of a particular Vrutta depends
on the number of Laghu and Guru in the Shloka.



V. Conclusion



Besides being a mathematical and scientific language, Sanskrit
is also helpful in speech therapy. Rhythmic chanting of a Shloka
creates a melodious effect in the body, known as the
Neuro-linguistic effect, and meaningful chanting generates the
Psycholinguistic effect. In this paper, a system for the
identification of Vrutta is presented, along with suggestions
for possible musical notations for a particular Vrutta. This is
useful for users who are unaware of the construction of Sanskrit
Shloka and of the relationship between the Vrutta and the
singing pattern of a Shloka. The system can also be considered a
guide to understanding Akshar Gana Vrutta identification by Gana
formation. Ten Akshar Gana Vrutta, namely Shardulvikridit,
Bhujangprayat, Prithvi, Shikharini, Stagdhara, Hansagati,
Vasanttilaka, Malini, Indravajra and Mandakranta, are covered.
Along with the identification of the Vrutta, a possible
musical notation suitable for singing a Shloka of that
Vrutta is suggested. The user is given the choice of playing the
audio file according to his or her comfortable energy level of
singing. The system can be further enhanced for other types of
Vrutta.
Abstract - This paper presents the design and analysis of high
bandwidth Connected E-H and E shaped microstrip patch
antennas. RT Duroid 5880 dielectric substrate material is used to
design these antennas. Sonnet Suites, a planar 3D electromagnetic
simulator, is used as the simulation tool in this work. The patch
antennas are fed using the co-axial probe feeding technique. The
proposed antennas provide impedance bandwidths of
50% and 56.25% of the center frequency. The results show that
the return loss is under -10 dB. The proposed antennas are
intended especially for satellite communications.

Keywords- Bandwidth, Connected E-H shaped Patch antenna, 
Dielectric Thickness, E-shaped Patch antenna, Return Loss Curve, 
S-Band, Space communication. 

I. INTRODUCTION 

The rapid development of wireless communication is leading
to the miniaturization of devices without compromising good
operational capabilities. The antenna is one of the basic needs
of any wireless communication system. To use an antenna in a
reduced size communication device, the antenna structure should
also be trimmed without affecting its quality of performance. In
this regard, the patch antenna plays a vital role because of its
low profile, light weight, low volume, conformability, low cost
and ease of integration with microwave integrated circuits [1].
The applications of patch antennas are many, including Global
Positioning System applications, WiMax, mobile and satellite
communication, radar and rectenna applications [2]. The
microstrip patch antenna also has disadvantages, such as
narrow bandwidth, excitation of surface waves and low efficiency
[1]. Much research has already been done to improve the
bandwidth and reduce the disadvantages of patch antennas, and
differently shaped patch antennas have been proposed to overcome
the limitations. This work designs two high bandwidth microstrip
patch antennas, Connected E-H shaped and E-shaped, to achieve
good bandwidth and mitigate these problems. They target S-band
communication, covering 2-4 GHz [3], which is used for
communications satellites, notably by NASA to communicate with
the Space Shuttle and the International Space Station [3].

II. DESIGN PROCEDURE 

In this research paper, the proposed Connected E-H and
E-shape microstrip patch antennas are designed with
dimensions W (34.9 mm) x L (28.7 mm) and
W (37.1 mm) x L (31 mm) respectively. The width and length of
the microstrip antennas are determined as follows [4].

Width Calculation (W)

$$ W = \frac{c}{2 f_0} \sqrt{\frac{2}{\varepsilon_r + 1}} \tag{1} $$

where $c$ is the free-space velocity of light, $\varepsilon_r$ is the
dielectric constant of the substrate, $f_0$ is the antenna working
frequency and $W$ is the non-resonant width of the patch.

Calculation of the Effective Dielectric Constant ($\varepsilon_{reff}$)

$$ \varepsilon_{reff} = \frac{\varepsilon_r + 1}{2} + \frac{\varepsilon_r - 1}{2} \left[ 1 + 12 \frac{h}{W} \right]^{-1/2} \tag{2} $$

where $h$ is the height (thickness) of the substrate.

Calculation of the Effective Length ($L_{eff}$)

$$ L_{eff} = \frac{c}{2 f_0 \sqrt{\varepsilon_{reff}}} \tag{3} $$

Calculation of the Length Extension ($\Delta L$)

The dimensions of the patch along its length are extended on each
end by a distance $\Delta L$, which is a function of the effective
dielectric constant $\varepsilon_{reff}$ and the width-to-height ratio
($W/h$). The extension of the length is

$$ \Delta L = 0.412\, h\, \frac{(\varepsilon_{reff} + 0.3)\left(\frac{W}{h} + 0.264\right)}{(\varepsilon_{reff} - 0.258)\left(\frac{W}{h} + 0.8\right)} \tag{4} $$

Calculation of the Actual Length of the Patch (L)

The actual length of the patch can be determined from

$$ L_{eff} = L + 2 \Delta L \tag{5} $$

III. GEOMETRY OF PATCH ANTENNAS

A. GEOMETRY OF THE CONNECTED E-H SHAPED PATCH

The Connected E-H shaped microstrip patch antenna is simple in
construction. Its geometry is shown in Fig. 1 with a box-wall
port, the most common type of port, which uses a reference plane
to remove the effect of the feeding transmission line. The patch
is designed and simulated in Sonnet Software, a planar 3D
electromagnetic simulator.

Figure 1. Top view of the Connected E-H shaped antenna
(L = 28.7 mm, W = 34.9 mm)

The design parameters of the proposed Connected E-H shaped
microstrip patch antenna are shown in Table I.

TABLE I: Proposed Connected E-H shape Patch Antenna Design Parameters

Antenna Design Parameter                    Material / value
Dielectric Material                         RT Duroid
Dielectric Constant ($\varepsilon_r$)       2.2
Loss Tangent                                9.0e-4
Height of Substrate (Thickness) (h) (mm)    1.8161
Width of the Patch (W) (mm)                 34.9
Length of the Patch (L) (mm)                28.7
Frequency of operation (GHz)                3.4
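
The design equations can be checked numerically against the tabulated dimensions. The sketch below assumes that equations (1)-(5) take the standard transmission-line-model form for a rectangular microstrip patch; the function name `patch_dimensions` and its argument names are illustrative and do not come from the paper.

```python
import math

C = 299_792_458.0  # free-space velocity of light in m/s


def patch_dimensions(f0_hz, eps_r, h_m):
    """Evaluate the assumed forms of equations (1)-(5).

    Returns (W, eps_reff, L_eff, delta_L, L) with all lengths in metres.
    """
    # (1) non-resonant patch width
    w = C / (2.0 * f0_hz) * math.sqrt(2.0 / (eps_r + 1.0))
    # (2) effective dielectric constant
    eps_reff = (eps_r + 1.0) / 2.0 + (eps_r - 1.0) / 2.0 * (
        1.0 + 12.0 * h_m / w) ** -0.5
    # (3) effective length
    l_eff = C / (2.0 * f0_hz * math.sqrt(eps_reff))
    # (4) length extension on each end of the patch
    ratio = w / h_m
    delta_l = 0.412 * h_m * ((eps_reff + 0.3) * (ratio + 0.264)) / (
        (eps_reff - 0.258) * (ratio + 0.8))
    # (5) actual length: L_eff = L + 2 * delta_L
    length = l_eff - 2.0 * delta_l
    return w, eps_reff, l_eff, delta_l, length


# Connected E-H patch: 3.4 GHz, eps_r = 2.2, h = 1.8161 mm
w, eps_reff, l_eff, delta_l, length = patch_dimensions(3.4e9, 2.2, 1.8161e-3)
print(f"W = {w * 1e3:.1f} mm, L = {length * 1e3:.1f} mm")
```

Under these assumptions the computed width and length come out at roughly 34.9 mm and 28.7 mm, which agrees with the stated dimensions of the Connected E-H patch. The same function can be called with 3.2 GHz and h = 1.04267 mm for the E-shaped patch.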



B. GEOMETRY OF THE E-SHAPED PATCH ANTENNA 

The E-shaped microstrip patch antenna is also simple in
construction. Its geometry is shown in Fig. 2. This patch is also
designed and simulated in Sonnet Software.

Figure 2. Top view of the E-shaped antenna
(L = 31 mm, W = 37.1 mm)

The design parameters of the proposed E-shaped microstrip patch
antenna are shown in Table II.

TABLE II: Design Parameters of the Proposed E-shape Patch Antenna 



Antenna Design Parameter                    Material / value
Dielectric Material                         RT 5800
Dielectric Constant ($\varepsilon_r$)       2.2
Loss Tangent                                9.0e-4
Height of Substrate (Thickness) (h) (mm)    1.04267
Width of the Patch (W) (mm)                 37.1
Length of the Patch (L) (mm)                31
Frequency of operation (GHz)                3.2
IV. SIMULATION RESULTS

Two broadbanding designs, the Connected E-H shaped patch and
the E-shaped patch, are presented in this research. Their
simulation results are given below and then discussed.

A. Proposed Connected E-H shape Patch Antenna 

The results are explained in terms of the return loss and the
input impedance. The current density on the antenna is also shown.

1. Return Loss Curve

The first important parameter, which is used to calculate the
bandwidth of the antenna structure, is its S11 in decibels
versus frequency. The antenna is fed at the point where S11 is
minimized. The return loss curve of the designed antenna is
presented in Fig. 3, and the minimum S11 level of -30.39 dB is
shown at marker m3. The figure shows that the antenna resonates
at 3.4 GHz.



Figure 3. Simulated Return Loss curve of Connected E-H shaped patch
antenna (Z0 = 50; markers m1: 2.2 GHz, -10.19 dB; m2: 3.9 GHz,
-10.01 dB; m3: 3.4 GHz, -30.39 dB)

The bandwidth can be described as a percentage of the center
frequency of the band.

Calculation of Bandwidth

$$ BW = \frac{F_H - F_L}{F_C} \times 100 \quad [5] \tag{6} $$

where $F_H$ is the higher frequency, $F_L$ is the lower frequency
and $F_C$ is the center frequency.

Here $F_L$ = m1 = 2.2 GHz, $F_H$ = m2 = 3.9 GHz and $F_C$ = m3
= 3.4 GHz. So the obtained bandwidth is 1.7 GHz, which is
50% of the center frequency.

2. Input Impedance Curve

Figure 4. Input impedance curve (Smith chart) of Connected E-H shaped
patch antenna (Z0 = 50; m1: 2.2 GHz, Mag = 0.309412; m3: 3.4 GHz,
Mag = 0.030245, Phase = -177.083, VSWR = 1.062377)

The VSWR circle is indicated by the red circle, where VSWR = 2.
The input impedance curve gives the magnitude, phase angle
and VSWR of the input impedance of the antenna at the
respective frequencies.

3. Current Density Diagram

The current density distribution is a measure of how the
antenna is producing a beam.

Figure 5. Current density diagram of the Connected E-H shaped patch antenna
at 3.4 GHz (JXY magnitude in Amps/Meter)



B. Proposed E-shape Patch Antenna 

The following results are obtained for the proposed E-shaped
patch antenna. The results are explained in terms of the
return loss and the input impedance. The current density on the
antenna is also displayed.

1. Return Loss Curve

The antenna is fed at the point where S11 is minimized. The
return loss curve of the designed antenna is presented in
Fig. 6, and the minimum S11 level of -33.3 dB is shown at
marker m3. The figure shows that the antenna resonates at
3.2 GHz.

Figure 6. Simulated Return Loss of E-shaped patch antenna

Calculation of Bandwidth:

$$ BW = \frac{F_H - F_L}{F_C} \times 100 \quad [5] \tag{6} $$

Here $F_L$ = m2 = 2.1 GHz, $F_H$ = m1 = 3.9 GHz and $F_C$ = m3
= 3.2 GHz. So the obtained bandwidth is 1.8 GHz, which is
56.25% of the center frequency.
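
The percentage bandwidth of equation (6) and the marker values reported for the Connected E-H patch can be cross-checked with a few lines of arithmetic. This is a minimal sketch: the function names are illustrative, and the VSWR and return-loss relations are the standard ones in terms of the reflection coefficient magnitude, which the paper does not state explicitly.

```python
import math


def bandwidth_percent(f_low_ghz, f_high_ghz, f_center_ghz):
    """Equation (6): bandwidth as a percentage of the center frequency."""
    return (f_high_ghz - f_low_ghz) / f_center_ghz * 100.0


def vswr_from_reflection(mag):
    """Standard relation VSWR = (1 + |G|) / (1 - |G|)."""
    return (1.0 + mag) / (1.0 - mag)


def return_loss_db(mag):
    """Standard relation RL = -20 * log10(|G|), in dB."""
    return -20.0 * math.log10(mag)


# E-shaped patch markers: F_L = 2.1 GHz, F_H = 3.9 GHz, F_C = 3.2 GHz
print(bandwidth_percent(2.1, 3.9, 3.2))

# Connected E-H patch, marker m3 at 3.4 GHz: reflection magnitude 0.030245
print(vswr_from_reflection(0.030245), return_loss_db(0.030245))
```

With the reflection magnitude of 0.030245 read at marker m3, these relations give a VSWR of about 1.062 and a return loss of about 30.39 dB, consistent with the values shown on the return loss and Smith chart plots.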



2. Input Impedance Curve 




Figure 7. Input impedance curve of E-shaped patch antenna 



3. Current Density Diagram

Figure 8. Current density diagram of the E-shaped patch antenna at 3.2 GHz

V. DISCUSSION 

The bandwidth increases as the substrate thickness
increases [4]. Here, the substrate of the proposed Connected E-H
shaped patch is thicker than that of the proposed E-shaped patch,
yet the E-shaped patch antenna obtained the higher bandwidth. The
size of the patch increases as the frequency decreases [4]. The
resonant frequency of the Connected E-H patch is slightly higher
than that of the E-shaped patch, so the proposed Connected E-H
patch is the smaller of the two. Increasing the substrate
thickness reduces conductor and dielectric losses [4]; the
E-shaped patch, with its thinner substrate, therefore has some
conductor and dielectric losses. As the substrate thickness
increases, the surface-wave power increases, thus limiting the
efficiency [4]. On the other hand, as the substrate thickness
increases, the quality factor Q of the patch decreases [4], which
increases efficiency [1]. As a result, the efficiency problem of
the Connected E-H shape is slightly reduced. As the substrate
thickness decreases, the effect of the conductor and dielectric
losses becomes more severe, limiting the efficiency [4]. For a
substrate with a moderate relative permittivity such as
$\varepsilon_r$ = 2.2, the efficiency is maximized [4], and a
dielectric constant of 2.2 is used in both the Connected E-H and
the E-shaped patch antennas. The efficiency of both shapes is
therefore expected to be close to its maximum, with conductor,
dielectric and surface-wave losses kept small.

VI. CONCLUSION 

This paper aimed to improve the bandwidth of microstrip
antennas constructed with a dielectric material of higher
dielectric constant. Two different patch antennas are presented,
simulated and discussed for wireless communications, especially
space communication covering 2.1-3.9 GHz, and their simulated
results are compared. At resonance, the VSWR is 1.06 for the
Connected E-H shaped patch and 1.044 for the E-shaped patch. The
obtained bandwidth of the proposed Connected E-H shaped patch is
higher than that of any existing Connected E-H shaped patch, and
that of the proposed E-shaped patch is higher than that of any
existing E-shaped patch antenna. The bandwidth of the proposed
E-shaped patch is better than that of the Connected E-H shaped
patch antenna.



Technology University, Noakhali, Bangladesh. His research 
interest includes Microstrip Patch Antenna, Wireless 
Communication Systems, Neural Networks and 
Communication Protocol. 



VII. FUTURE SCOPES 

1. Increase the bandwidth further by reducing the patch
antenna size, using a substrate with a higher dielectric
constant.

2. Vary the feed elements to optimize the patch
antenna.
Abstract — Encryption is used to conceal information from prying
eyes. Presently, information and data encryption are common
due to the volume of data and information in transit across the
globe on a daily basis. Image encryption has yet to receive the
attention it deserves from researchers. As a result, video
and multimedia documents are exposed to unauthorized
access. The authors propose image encryption using matrix
transpose, and an algorithm that allows image encryption is
developed. In the proposed image encryption technique, the
image to be encrypted is split into parts based on the image size.
Each part is encrypted separately using matrix transpose. The
actual encryption is on the picture elements (pixels) that make up
the image. After each part of the image is encrypted, the positions
of the encrypted parts are swapped before the image is
transmitted. Swapping the positions makes the encrypted image
harder for a cryptanalyst to decrypt.

Keywords- Image Encryption; Matrices; Pixel; Matrix 
Transpose 

I. INTRODUCTION 

Image processing is a method used to convert an
image into digital form and to perform operations on it.
These operations include rotate, resize, transform, etc. The
aim of carrying out operations on an image is to get an
enhanced image or to extract useful information from it that
may be used for other purposes. Image processing is a type of
signal processing in which the input is an image, such as a
photograph, and the output may be an image or some
characteristics associated with that image. An image
processing system usually treats images as two-dimensional
signals and applies established signal processing methods to
them. Images, like text, can be encrypted.

Encryption transforms plaintext messages into
ciphertext messages. In earlier days, only text-related
information was secured. But today, with the proliferation of
video and multimedia documents on the Internet, there is a need
to also secure image documents from unauthorized access. Images
are represented using pixels, which can be represented
mathematically using matrices. In image encryption, the
encryption algorithm transforms an image into a form that cannot
be recognized as the original image. The authors here propose a
new image encryption technique that deploys matrix transpose to
encrypt image pixels.

II. RELATED LITERATURE 

Reference [1] stated that all images consist of pixels.
These pixels may have values of type double or byte. An image
is represented, for all mathematical purposes, as a matrix.
The matrix equivalent of an image of size NxM pixels is an
NxM matrix, where each pixel corresponds to an element of
that matrix. This is a two-dimensional image. For a typical
colour image such as an RGB image, the matrix representation
is three-dimensional. The additional dimension holds the
red, green and blue proportions, each of which is a
two-dimensional grayscale image.

A pixel is the smallest element of an image. Each pixel
holds a single value. In an 8-bit grayscale image, the value
of a pixel is between 0 and 255 (2^8 = 256 levels). The value
of a pixel at any point corresponds to the intensity of the
light photons striking that point; each pixel stores a value
proportional to the light intensity at that particular
location. Pictures may be used to illustrate the meaning of a
pixel. A given picture may contain thousands of pixels, which
together form the image. When the image is zoomed, the
individual pixels become visible. Note that a digital image
is composed of a finite number of elements, each of which has
a particular location and value f(x, y), where (x, y) are the
coordinates along the x and y axes. These elements are called
picture elements, image elements or pixels. Pixel is the term
used most widely to denote the elements of a digital image [2].
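
The matrix view of an image described in this section can be illustrated with plain nested lists. This is only a sketch: the tiny 2 x 3 image, the helper names `shape`, `pixel` and `is_valid_8bit`, and the choice of storing a colour image as one plane per channel are all illustrative assumptions, not part of the cited references.

```python
# A 2 x 3 grayscale image stored as a list of rows (8-bit values, 0..255).
gray = [
    [0, 128, 255],
    [64, 32, 16],
]


def shape(matrix):
    """Return (rows, columns) of a list-of-rows matrix."""
    return len(matrix), len(matrix[0])


def pixel(matrix, x, y):
    """Return the value f(x, y): column x of row y."""
    return matrix[y][x]


def is_valid_8bit(matrix):
    """Check that every pixel lies in the 8-bit range 0..255."""
    return all(0 <= value <= 255 for row in matrix for value in row)


# A colour image adds a third dimension: one N x M plane per channel.
rgb = {"R": gray, "G": gray, "B": gray}

print(shape(gray), pixel(gray, 2, 0), is_valid_8bit(gray), len(rgb))
```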

III. IMAGE ENCRYPTION PROCESS 

A picture can be encrypted in the same way that text is
usually encrypted. A sequence of mathematical operations on
the binary data that comprises an image may be used to
carry out the encryption process. This can be achieved by
changing the values of the numbers contained in the image in
a given manner. This scrambles the image and renders it
unrecognizable.

Reference [3] pointed out that a secure image encryption 
algorithm based on Rubik's cube principle uses two secret 
keys equal to the number of rows and columns of the plaintext 
image and that based on the principle of Rubik's cube, the 
image pixels are scrambled and then, XOR operator is 
applied on the rows and columns. 

IV. IMAGE ENCRYPTION SOFTWARE

Today, some major computer operating systems come
with some form of encryption software. For instance,
Microsoft provides BitLocker with Windows 7, while
Mac OS X comes with FileVault. Dropbox, PowerFolder, and
Cloudfogger are online file storage systems that include
encryption as part of their data security. Some encryption
software allows images to be batch processed while others do
not. Most encryption software can handle common image files
such as BMP, TIF, RAW, PSD, and JPG. Some image processing
software is open source and can be downloaded from the
Internet freely, while other packages are only available on
payment of a licence fee. One of the most popular image
processing packages is Matrix Laboratory (MATLAB), which is
usually licensed.

Reference [4] observed that the tried-and-true method of
adding encryption to a picture is through steganography,
which is the art of creating hidden images. In the digital world,
this is done by methods such as modifying the least-significant
bits in bitmap images or flashing subliminal messages in a
video stream. Steganography is very useful for putting digital
watermarks in an image. A watermark is typically used to
identify ownership of the copyright of the signal in which it
appears. It is often used by software companies to prevent
users from continued free usage of their software.

As digital audio, video, images, and documents are
transmitted through cyberspace to their respective
destinations, some individuals may choose to intercept and
take this content for themselves. Digital watermarking and
steganography technology greatly reduces the instances of this
by limiting or eliminating the ability of third parties to
decipher the content of the information [5].

V. SELECTIVE IMAGE ENCRYPTION

In selective encryption, only some contents of the image are
encrypted. Encrypting only part of the image reduces
the execution time. Consequently, selective encryption is
sometimes called partial encryption. This algorithm provides
security to the image while, at the same time, some part of the
image remains visible [6].

Today, millions of images are transmitted in seconds across
the globe and, as such, the security of images is becoming a
major concern to businesses across the world. Encryption is a
solution to the security concern of transmitted images. The
security of an image can be achieved by various types of
encryption schemes. Different algorithms have been proposed.
Among these, the chaos-based methods are considered to be
more reliable and promising. However, this technique is
complex in nature. Chaotic image encryption can be
developed by using properties of chaos, including deterministic
dynamics and unpredictable behaviour. There are three kinds
of encryption techniques, namely substitution, transposition
(permutation), and techniques that combine both transposition
and substitution. Substitution schemes change the pixel values,
while permutation schemes just shuffle the pixel positions based
on the algorithm. In some cases both methods are combined to
improve security.



VI. MATRICES IN DIGITAL IMAGE 
ENCRYPTION 

Image encryption is a relatively new development, unlike
text encryption, which has existed from time immemorial.
Several researchers have proposed image encryption
techniques. Digital images are recorded as many numbers. The
image is divided into a matrix or array of small picture
elements, or pixels. Each pixel is represented by a numerical
value. Digital images have the advantage that they can be
processed in many ways by computer systems.

Here, we propose the deployment of matrices in digital
image encryption.

Reference [7] pointed out that an identity matrix, which is
denoted $I_n$, is characterized by a diagonal of 1's
surrounded by zeros in a square matrix. When a vector is
multiplied by an identity matrix of the same dimension, the
product is the vector itself, $I_n v = v$.

VII. USING MATRIX FOR IMAGE 
ENCRYPTION 

Reference [8] noted that a transpose of a doubly indexed
object is the object obtained by replacing all elements $a_{ij}$
with $a_{ji}$. For a second-rank tensor $a_{ij}$, the tensor
transpose is simply $a_{ji}$. The matrix transpose, most commonly
written $A^T$, is the matrix obtained by exchanging the rows and
columns of $A$, and satisfies the identity $(A^T)^T = A$.
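
As a small worked example of the transpose and of the identity $(A^T)^T = A$, consider an arbitrary 2 x 3 matrix chosen here purely for illustration:

```latex
A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix},
\qquad
A^{T} = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix},
\qquad
\left(A^{T}\right)^{T} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} = A .
```

Element $a_{12} = 2$ moves to position $(2, 1)$ of $A^T$, and transposing a second time returns it to $(1, 2)$. This is why applying the transpose twice recovers the original pixel arrangement.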

The proposed algorithm for deploying matrix transpose in
image encryption is as follows:

i. Divide the image into parts $P_i$, i = 1, 2, 3, ..., n

ii. Assign each part $P_i$ to a matrix $M_i$, i = 1, 2, 3, ..., n

iii. Read the picture elements (pixels) of each matrix $M_i$

iv. Encrypt the pixels of each matrix by carrying out the matrix
transpose, $M_i^T$

v. Swap the positions of the transposed matrices

vi. $M^T = \sum_{i=1}^{n} \{M_i\}^T$

vii. Output $M^T$ as the encrypted image

The decryption process is the reverse of the encryption
algorithm outlined above. It starts from the output, restores
the original positions of the parts, and then transposes each
individual matrix again, since $(M_i^T)^T = M_i$. The pixel
values are thereby reordered to arrive at the original matrix
pixel values.
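
The encryption and decryption steps can be sketched in a few lines of Python. The paper does not specify how the image is partitioned or how the parts are swapped, so this sketch makes two labelled assumptions: the image is split into `n_parts` horizontal bands of equal height, and the swap simply reverses the order of the bands. The function names are illustrative.

```python
def transpose(block):
    """Return the transpose of a list-of-rows matrix."""
    return [list(column) for column in zip(*block)]


def split_rows(matrix, n_parts):
    """Split a matrix into n_parts horizontal bands of equal height."""
    if len(matrix) % n_parts != 0:
        raise ValueError("row count must be divisible by n_parts")
    step = len(matrix) // n_parts
    return [matrix[i:i + step] for i in range(0, len(matrix), step)]


def encrypt(image, n_parts):
    parts = split_rows(image, n_parts)            # steps i-iii
    transposed = [transpose(p) for p in parts]    # step iv
    swapped = transposed[::-1]                    # step v (assumed: reverse order)
    return [row for part in swapped for row in part]  # steps vi-vii


def decrypt(cipher, n_parts):
    parts = split_rows(cipher, n_parts)
    restored = [transpose(p) for p in parts[::-1]]  # undo swap, transpose again
    return [row for part in restored for row in part]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12],
]
cipher = encrypt(image, 2)
print(cipher)
print(decrypt(cipher, 2) == image)
```

In this sketch the number of parts must be known to the receiver in order to decrypt. The procedure only rearranges pixel positions; the pixel values themselves are unchanged.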



VIII. COMPARISON BETWEEN IMAGE 
ENCRYPTION USING MATRIX TRANSPOSE AND 
SELECTIVE IMAGE ENCRYPTION 

The selective image encryption technique encrypts only some
parts of the image, leaving other parts unencrypted. This gives an
adversary the opportunity to use the unencrypted part of the
image to easily recognize the original image. Selective image
encryption may be easier to implement, but since the security of
images is of utmost importance, the matrix transpose image
encryption is more robust in terms of securing images to be
transmitted across the globe.

IX. CONCLUSION 

Encrypting images is as important as encrypting text
messages. Text message encryption has received a fair amount of
research attention. The same cannot be said of image encryption.
For now, only a few researchers are interested in image
encryption and, as such, the literature is limited. The proposed
image encryption technique, which deploys matrix transpose,
allows the encryption of the entire image, unlike selective image
encryption, which encrypts only part of the image.
Abstract — Handheld device systems have been used as tools for
teaching people with special needs because multimedia,
attractive graphics and user-friendly navigation enhance
cognitive function. Can a handheld device system, such as a
cellular phone, be used for teaching illiterate people? This
paper explores the possibility of developing an educational
mobile system to help illiterate people in Egypt.

Index Terms — Graphical User Interface; Audio; Graphics; 
Video, Wireless; Mobile System; Arabic alphabet; Arabic 
speaking illiterate people; illiteracy. 

I. Introduction 

Literacy can be defined in many ways. The U.N. defines
a literate person as someone who can "...with
understanding, both read and write a short simple statement
in his or her everyday life" [19]. Learning the alphabetic
letters could be more difficult than learning numbers for
illiterate people [14].

Although the number of illiterate people around the
world is estimated to be 800 million, they can still use
mobile phones appropriately. To the best knowledge of the
authors, little research has been done to understand the
reasons behind that. Most illiterate people are from developing
countries, and females represent a high percentage of the
800 million [2].
In Egypt, the total number of illiterate people aged 10
years or more exceeded 16 million in 2012, according to
the Egyptian Central Agency for Public Mobilization and
Statistics (CAPMAS) [7]. According to [8], there are
112.81 mobile phones per 100 Egyptian citizens.

The flexible business model of the mobile phone has proved
to be viable, particularly in developing countries. Despite
infrastructural shortcomings, high cost of ownership and
limited power available for charging devices, mobile devices
have widely penetrated society at all levels [2].

The nature of current technological advances in the
mobile phone domain suggests a future decrease in the cost of
smart phones for customers in general, including customers in
developing countries. Recently, cheap Chinese Android-based
devices have appeared in the Egyptian market. In the future,
such devices are expected to become more affordable to
lower-income segments.

With the international effort to eliminate illiteracy, the 
problems related to inequalities have deepened. For 
instance, in Egypt, children of different social backgrounds 
do not have equal opportunities to learn and reap benefits. 
Furthermore, they are trapped and cannot get out of the 
vicious circle of poverty. 

Egypt has recognized that tackling illiteracy is one of the
core pillars of its development. Despite the efforts made in
past decades under different governments, education
remains a challenge. Even though the percentage of
illiteracy is decreasing, the number of people struggling to
read and write is increasing. This indicates that there are
issues with the implemented educational policies. One of
these issues is the approach used to teach students.

In previous work [15] [16], the authors proposed a
system to teach deaf people using cell phone technology.
In this paper, the authors extend and re-use that work to
teach the Arabic alphabet to illiterate Egyptians.

The authors found online products that teach the Arabic
alphabet to non-Arabic-speaking people [4] [5] [6]. To the
best knowledge of the authors, there are no studies
centered on teaching the Arabic alphabet to Arabic-speaking
illiterate people using mobile systems. The authors consider
that the problem of teaching the Arabic alphabet to illiterate
Egyptians should be divided into two steps:

• Gathering baseline data on how illiterate users recognize
and react to the mobile interface,

• Based on the information gathered in the previous
step, developing a suggested system.

This study concentrates on the first part and paves
the way for the second part in a sequel paper.

II. Data Gathering 



The objective of this section is to gather baseline data
about the effectiveness and usability of the mobile
interface. The two experiments are conducted using a
Samsung device running the Android operating system with
the ePhone application installed; see figure 1.

As mentioned in the introduction, the target of this
study is to gather baseline data on how illiterate users
recognize and react to the mobile interface. This
empirical study involves five novice illiterate participants.
None of the users has previous experience of using mobile
phones. Some other empirical studies involve only seven
novice participants [1]. For some empirical investigations,
the baseline data is more important than the number of
participants. The baseline data will be used for further
investigations that involve more participants. The profile of
the five novice illiterate participants is shown in table 1.



TABLE I. ILLITERATE PARTICIPANTS CHARACTERISTICS

Participants    Experience level    Age group    Gender
P1              Novice              21-34        Male
P2              Novice              35-65        Female
P3              Novice              12-20        Female
P4              Novice              21-34        Male
P5              Novice              35-65        Female



A. First Experiment 

The experiment starts with an introduction explaining
the tasks to be performed by the participant. The tasks are:

• First, dial a specific number,

• Next, talk for a few seconds,

• Last, exit the call.

The results of the tasks for the five participants (see table
1) are shown in figure 2, which gives the dialing and calling
(i.e. talking) times in seconds for every participant, plus the
average. It is clear from figure 2 that the dialing time is
substantially greater than the calling time.




Figure 2. The results of the tasks for the five illiterate participants
(P1-P5) plus the average of the dialing time and calling (i.e. talking)
time.

After the participant finishes the experiment, general
feedback is discussed with the participants. The
discussion focuses on:

• How well do participants read and understand icons?

• Which icons were problematic and why?

• What do participants think of the overall performance of
the application?

B. Second Experiment 

The experiment starts with an introduction explaining
the tasks to be performed by the participant. The tasks are:

• First, start a game,

• Next, go through the levels until reaching the results,

• Last, start another round.

Figure 1. Smart phone mobile numbers call interface

After the participant finishes the experiment, general
feedback is discussed with the participants. The
discussion focuses on:

• How flexible is the navigation/usage of the developed 
application? 

• Does it satisfy the needs and requirements of the test 
subject? 

• Does it provide an understandable interface of 
minimal knowledge requirement? 

• What possible usability/understanding errors could 
arise from the test? 

• How tolerable are these errors, and how can they be 
fixed in favor of higher usability? 

• Are the test subjects able to go through the 
application from start to finish seamlessly? 

The results of the tasks for the five participants (see table
1) are shown in table 2 and table 3. Table 2 shows the types
of errors that each participant committed when performing a
given task. Each type of error is described in table 3.
TABLE II. Results of the second experiment

Participants  Tasks                                       Type of  Time of each interface
                                                          errors   (approx., seconds)
P1            Start The Game                              0        1
              Playing The Game                            3        4
              Reach Results Screen & Start Another Round  2        2
P2            Start The Game                              0        1
              Playing The Game                            0        10
              Reach Results Screen & Start Another Round  0        3
P3            Start The Game                              0        1
              Playing The Game                            2        3
              Reach Results Screen & Start Another Round  2        5
P4            Start The Game                              0        1
              Playing The Game                            0        5
              Reach Results Screen & Start Another Round  0        2
P5            Start The Game                              0        1
              Playing The Game                            0        6
              Reach Results Screen & Start Another Round  0        3

TABLE III. Types of errors and their mitigation for the second experiment

Type  Error             Mitigation method                            Level
1     Inefficient Click The finger size of the participant is to     Irritant
                        be considered
2     Wrong Answer      Installing a voice narrator that instructs   Irritant
                        the participants
3     Rapid Clicks      Adding a sound effect to their clicks to     Moderate
                        quickly adapt to the program
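
The per-task times of the second experiment can be summarized with a short script. This is a minimal sketch: the values are transcribed from the second-experiment results table (approximate seconds per interface), while the dictionary layout and the use of a simple arithmetic mean are choices made here for illustration and are not part of the original study.

```python
from statistics import mean

# Approximate time per interface in seconds for participants P1..P5,
# transcribed from the results of the second experiment.
times = {
    "Start The Game": [1, 1, 1, 1, 1],
    "Playing The Game": [4, 10, 3, 5, 6],
    "Reach Results Screen & Start Another Round": [2, 3, 5, 2, 3],
}

# Mean time per task across the five participants.
averages = {task: mean(values) for task, values in times.items()}

for task, average in averages.items():
    print(f"{task}: {average} s on average")
```

Playing the game takes the longest on average, which is also the task on which the rapid-click error (type 3) was recorded.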



C. General Feedback 

The general feedback discussion of experiment 1 and
experiment 2 with the five participants shows the following:

1) Small icon confusion: The participants did not
know where to click to create a new contact. After
hesitating for a few seconds and searching for the icon,
which was hard to find because of its size, they finally
identified the icon that leads to the new contact page.

2) Multiple clicks: The participants clicked twice on
the send button in order to send the message. The first time,
the participants pressed the button while the on-screen
keypad was open. They attempted once more after closing
the keypad.

3) Recognition delay: The participants pressed on the
image, thinking it was a button. However, they immediately
realized what it was and then clicked on the actual call
button. When participants attempted to exit the application,
they took a while to recognize which icon performs that
action, since it is the only icon that has no description.

4) Small icon recognition: Delay leading to hesitation.
The icon for creating a new contact is relatively small
in comparison to other buttons. The participants spent quite
some time searching for the icon in order to create a new
contact. They hesitated, going back and forth in the
application page, searching for where the task should be
performed.



III. Proposed System 

Users interact with mobile applications through different
graphical user interface (GUI) components such as buttons,
icons, nested menus, etc.

Controversial user interface (UI) topics include the
inclusion [18] or exclusion [3] [10] of text labels.
Moreover, some studies use drawings [9] instead of icons.
Common UI components, namely the concept of soft-keys,
vertical scrollbars, short text labels [11] [12] and the
concept of a focus in lists [20], were described as hard to
understand [2].

Chipchase's work [13] shows that illiterate users could
perform tasks such as turning on their phones and accepting
incoming calls, whereas dialing numbers to make outgoing
calls proved difficult for some. However, simple tasks such
as changing the clock or sending a message could be easy
for some illiterate users who memorize the steps. In that
respect, it is important to understand the causes of these
problems while interacting with the mobile phone. We should
take into consideration that mobile devices are rapidly
penetrating the market of developing countries, targeting the
majority of the population, and could help support the
education of illiterate people.

The participants in the experiment mainly faced critical
errors, which either led them to spend too much time
performing a particular task or to abandon the task. The time
spent on each task exemplifies the delay that participants
face when performing a particular task.

To design an interface for illiterate Arabic-speaking 
people, a number of changes need to be considered in the 
GUIs. It has been recommended to: 

• Avoid long text, i.e., minimize reliance on text. 

• Present text in conjunction with audio. 

• State what each button does, either underneath the 
symbol or in a yellow tooltip. This recommendation 
suits barely educated users more than illiterate 
ones. 

• Make extensive use of pictures, shapes, handwriting, 
special signs, and colors. Audio and graphic support 
in GUIs is extremely valuable for illiterate users. 

• Increase the size of the icons and clarify what each 
icon is used for. Illiterate persons may not understand 
the symbol of the icon. The users were able to 
recognize nearly all icons, except the "create new 
contact" icon, which caused considerable delay. 
The problem with that icon was that it was too small 
and had no label stating what it does. 

• Use recognizable icons instead of 
menus; the proposed system should require the least 
possible amount of memorization from illiterate 
users. 



IV. Design of the Proposed System 

Based on the previous results of [14] and [17], a mobile 
application will be designed. The mobile application 
consists mainly of six components, as shown in figure 3: 

1) Page Loader: contains the list of games; 

2) Data Keeper: this is the game engine. It has a 
score counter that records the time, the number of 
mistakes and the number of correct answers. It 
also stores whether the user selected the 
correct answer on the first attempt; 

3) Game Keeper: responsible for reading 
the score achieved in the game and storing it in 
the database; 

4) Performance Tracker: responsible for 
reading stored score information and displaying 
it according to the attempts made by the user; 

5) Audio Player: responsible for playing 
suitable audio files related to the opened page, in 
addition to providing audio feedback to the user 
after playing the game. Without loss of 
generality, from now on all snapshots of the 
mobile application are assumed to have audio 
interaction between the user and the mobile 
application, even if not explicitly mentioned; 

6) Multimedia Generator: takes Arabic text as 
input and utilizes Natural Language 
Processing techniques to classify the text and 
retrieve multimedia elements (i.e., images and 
videos) related to the text. 




[Figure 3 blocks: Page Loader, Data Keeper, Game Keeper, Performance Tracker, Audio Player, Multimedia Generator] 

Figure 3. System components 

A. System Architecture 

The system is composed mainly of two parts: the 
application server that contains all mobile resources (e.g., 
pages, games, database, etc.) and the mobile application 
which sends queries to the server to load the required 
resource elements, as shown in figure 4. 




Figure 4. System architecture 

B. Snapshots 

• Snapshot-1 

When the mobile application is launched, the following 
main screen will be displayed which allows the user to 
access the basic buttons and listen to the recorded voice 
associated with them, as shown in figure 5. 







Figure 5. Snapshot-1 



• Snapshot-2 

By clicking the icon of the boy playing football on the 
top row in Snapshot 1, the screen shown in figure 6 will be 
displayed. The icon in the upper row can be clicked to go 
back to the previous screen. 




Figure 6. Snapshot-2 



• Snapshot-3 

When the user clicks the icon "possible" (Arabic: 
Momken), he can then click the icon "drink" 
(Arabic: Ashrab) located in the middle; a 
collection of drink flavors is then displayed to allow him to 
select the flavor he wants, as shown in figure 7. 




Figure 7. Snapshot-3 

• Snapshot-4 

Various icons that have particular meanings are 
displayed in the mobile application. If the user presses and 
holds any button for two seconds, a loading progress indicator 
appears on the button, and a recorded voice explanation of 
the button's usage is then played, as shown in figure 8. 
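
The press-and-hold behaviour described above can be sketched as a small timing helper. This is only an illustration under stated assumptions: the class name HoldTracker, its methods, and the use of plain timestamps in seconds are hypothetical and not taken from the application; only the two-second threshold comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

HOLD_SECONDS = 2.0  # hold duration stated in the text


@dataclass
class HoldTracker:
    """Tracks one press-and-hold gesture on a button (hypothetical helper)."""

    hold_seconds: float = HOLD_SECONDS
    pressed_at: Optional[float] = None  # timestamp of the press, or None

    def press(self, now: float) -> None:
        """Record the moment the button was pressed."""
        self.pressed_at = now

    def release(self) -> None:
        """Cancel the gesture when the finger is lifted."""
        self.pressed_at = None

    def progress(self, now: float) -> float:
        """Fraction of the hold completed, clamped to [0, 1]."""
        if self.pressed_at is None:
            return 0.0
        elapsed = max(0.0, now - self.pressed_at)
        return min(1.0, elapsed / self.hold_seconds)

    def should_play_explanation(self, now: float) -> bool:
        """True once the button has been held for the full duration."""
        return self.progress(now) >= 1.0
```

In such a design, the value returned by progress would drive the loading indicator drawn on the button, and the audio explanation would start when should_play_explanation first returns True.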







Figure 8. Snapshot-4 

• Snapshot-5 

When the user enters the learning mode in the application, 
it loads the progress of the user stored in the mobile. 
The mobile application starts by reading out the available 
options using audio files and waits for the user to choose one: 
1) study the Arabic numbers; 2) study the Arabic letters; 3) 
input Arabic text to retrieve multimedia elements; and 4) 
play games to evaluate what the user has learned, as shown 
in figure 9. 



Figure 9. Snapshot-5 

• Snapshot-6 

Before starting the game, the user can watch an 
animated cartoon lesson to learn about the specified topic. 
Each lesson lasts no more than three minutes, to keep 
the user's attention. Snapshot 6, shown in figure 10, shows a 
mathematics lesson on basic object enumeration. 




Figure 10. Snapshot-6 



• Snapshot-7 

After watching the lesson, the user can play games 
related to the lesson for self-evaluation. Screens A and B, 
shown in figure 11, are two different 
game pages that evaluate what the user has learned 
about the enumeration of objects. In screen A, the user is 
asked to select the number that is 
shown by the hand; in screen B, he is asked to sort 
numbers drawn on eggs into the right order. 




Figure 11. Snapshot-7 



• Snapshot-8 

Screens A and B, shown in figure 12, are 
further game pages that evaluate what the user has 
learned about the order of the week days. In screen A, the 
user is asked to sort the days by selecting the appropriate day 
written on the mushroom. In screen B, the user is required to 
see which day the boy is asking for, and select the 
appropriate day written on each leaf. 




Figure 12. Snapshot-8 



• Snapshot-9 

Screens A and B, shown in figure 13, are pages used 
to teach the user Arabic letters. The application recites 
the Arabic letters, as shown in screen A; then several examples 
are given for each letter, as shown in screen B. 




Figure 13. Snapshot-9 



• Snapshot-10 

Screens A and B, shown in figure 14, represent a 
game that evaluates the user through various questions in 
different styles about what has been learned about the Arabic 
alphabet. In screen A, the user is asked to select the 
image whose name starts with the presented letter. In 
screen B, the user is asked to connect each presented 
letter with the image whose name starts with 
that letter. 




Figure 14. Snapshot-10 



• Snapshot-11 

After watching the lesson on addition, the user can 
play games related to that lesson for self-evaluation, as 
shown in figure 15. The user is asked to select the correct 
answer from the presented numbers. 




Figure 15. Snapshot-11 



• Snapshot-12 

Snapshot 12, shown in figure 16, presents an animated 
cartoon lesson about the days of the week. It 
explains how to pronounce the days and 
how they are ordered. The lesson illustrates the order of 
the days by constructing and connecting train parts in sequence 
according to the day written on each part. 




Figure 16. Snapshot-12 

• Snapshot-13 

Snapshot 13, shown in figure 17, demonstrates an 
example of the feedback a user would get after entering 
Arabic text into the system; the text translates as, "A rabbit 
has brown or white fur. It eats carrots and moves around 
by jumping." 




Figure 17. Snapshot-13 




• Snapshot-14 

After playing the game, feedback results are displayed 
to the user in both written and audio forms, as shown in 
figure 18. The score is divided into three fields: 

a) Completion time (in ms), 

b) Number of correct answers from first attempt, 

c) Number of wrong answers. 
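
The three score fields listed above could be held in a small record like the following. This is a minimal sketch, not the application's actual Data Keeper: the names GameScore and score_game, and the representation of each question as a list of True/False attempts in the order they were tried, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GameScore:
    """The three feedback fields named in the text (hypothetical record)."""

    completion_ms: int = 0          # a) completion time in milliseconds
    first_attempt_correct: int = 0  # b) correct answers on the first attempt
    wrong_answers: int = 0          # c) number of wrong answers


def score_game(attempts: List[List[bool]], completion_ms: int) -> GameScore:
    """Build a score from per-question attempt lists.

    Each inner list holds one question's attempts in order, where True
    marks a correct answer and False a wrong one (assumed input format).
    """
    score = GameScore(completion_ms=completion_ms)
    for tries in attempts:
        # Every False entry is one wrong answer.
        score.wrong_answers += sum(1 for ok in tries if not ok)
        # Count the question only if its very first attempt was correct.
        if tries and tries[0]:
            score.first_attempt_correct += 1
    return score
```

For example, a player who answers the first question correctly at once, the second after one mistake and the third after two mistakes would get one first-attempt answer and three wrong answers.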




Figure 18. Snapshot- 14 

V. CONCLUSION 

A number of studies concur that the current 
mobile phone user interface design is not 
suitable for illiterate individuals [14]. 
They found that audio and graphic support in user 
interfaces is highly valuable for improving comprehension 
and ease of use for illiterate people [14]. 

The study of the relation between illiterate and semi-literate 
people and their society in the context of using mobile 
phones is still in its early stages. The rapid technical 
development and the changing market of mobile phones 
certainly increase opportunities for illiterate individuals in 
terms of cognition and communication. 

The use of enhanced GUI systems, complemented with 
multimedia support such as audio, image and video, enhances 
the usage experience for people with literacy-related 
challenges. In this study, the authors laid the groundwork for 
the proposed system, which will be investigated in a sequel paper. 

Abstract: A friendly interface is necessary to make a system more efficient and effective. The development of Urdu 
recognition is a key element of research, as it provides an efficient and natural way of input to the computer. This paper 
presents a framework based on Urdu layout and recognition of handwritten digits and text images using different 
techniques. The survey of Urdu documents leads to the following conclusions regarding data sets, techniques 
and algorithms: the most widely used technique is HMM, and the data sets involve training sets that contain 
different image styles and sizes as well as handwritten text. 



Keywords: HMM, Urdu documents, Rule based Approach 

I. INTRODUCTION 

To make a system more efficient, it needs a 
friendlier interface so that the user can interact 
with the computer without difficulty. Many research 
efforts have been made to make human-computer 
interaction increasingly responsive. The improvement 
of Urdu recognition is a key element of this research, 
as it provides an efficient and natural way of input 
to the computer. Urdu is the national language of 
Pakistan and is spoken in more than 22 countries by 
almost 60 million native speakers. Its alphabet 
contains 38 letters, of which 17 have dots either 
above or beneath them. Urdu script is written from 
right to left. The most popular script for writing 
Urdu is Nastaleeq, developed from two 
different scripts: Naskh and Taleeq. This paper 
presents a framework based on Urdu layout and 
recognition of handwritten digits and text images 
using different procedures. The capability of a 
computer to understand handwritten and scanned 
documents is important, as it can enable more 
efficient research. 

The process of determining layout 
arrangements by analyzing page images is 
called "layout analysis". It can be physical or 
logical. Here we present a layout analysis system 
for Urdu document images that extracts text lines 
in reading order. Handwritten numeral 
recognition suffers from the similarity between 
handwritten numerals and the dual style for Urdu. 

Image understanding is concerned with 
the extraction of semantic information from a 
document. To navigate through documents, a 
Table of Contents (ToC) is used, which 
enables a person to navigate through a large volume 
of scanned pages efficiently. This paper offers 
a quick analysis of various approaches related to 
ToC extraction. ToC research can be divided into 
three areas: ToC page detection, ToC parsing, and 
linking the actual pages with the recognized 
parts. 

In this paper we present different 
recognition techniques, including word-level, 
HMM, complete-level and annotation techniques for 
handwritten text images. BPNN is used for offline 
handwritten Urdu digits; for online recognition, 
STNN, OLUCR, Tree based Dictionary Search and an 
Intuous Wacom Board are used. A neural network is 
used for OCR of Urdu script, and a morphology 
technique is used for the pattern matching approach. 
A smoothing technique and a bigram NER tagger are 
used for multi-font numeral recognition for Urdu 
script. Fuzzy linguistic, HMM and hybrid approaches 
are used for the extraction of named entities (NEs) 
from text. HMM and hybrid approaches are presented 
for both domains of multi-font numeral recognition. 
A database retrieval approach is used for word 
spotting in scanned Urdu documents. We describe the 
problems in the recognition of handwritten and 
scanned text images in Urdu script and the solutions 
obtained by applying the above-mentioned techniques 
in the proposed approach. 

The remainder of the paper is organized as follows: 
Section II describes related work on 
Urdu documents, Section III presents the analysis 
of all the surveyed research papers, 
Section IV contains the conclusion and 
Section V describes future work; the references 
are provided at the end. 

II. LITERATURE REVIEW 

This section gives a brief explanation of all 
the research papers that are analyzed. 

A) Online Urdu Character Recognition 
System [1] 

A new system is introduced that focuses on 
online Urdu character recognition using a 
"segmentation-free technique", that is, the 
recognition of one complete word rather than its 
individual segments; these words, when combined, 
form the complete sentence. The technique 
mainly uses a BPNN (Back Propagation Neural 
Network) trained on the dataset. Although the 
proposed system is reported to be very efficient, 
much remains to be explored, and future work can 
bring further advances in this field of Human 
Computer Interaction. 

B) Optical Character Recognition System for 
Urdu (Naskh Font) Using Pattern 
Matching Technique [2] 

This research paper focuses on offline 
OCR for the Naskh font in the Urdu language. A new 
system is presented that works by matching the 
pixel values of stored models against the pixel 
values of the character images to be 
recognized. The pattern matching 
method for optical recognition, on which the 
dataset is trained, is described for the proposed 
system. The algorithms used for the recognition 
system are listed as follows: Chain 
Code Calculation, Line Segmentation and 
Character Segmentation. 
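
To make the idea of chain code calculation concrete, the following is a minimal sketch of the classic 8-direction Freeman chain code for an already ordered contour. It is not the surveyed paper's implementation, whose exact variant is not described here. The function name chain_code, the (x, y) point format with y growing downward as in image coordinates, and the numbering that starts with 0 for "east" and proceeds counter-clockwise on screen are all assumptions.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]

# Freeman 8-direction codes for a unit step (dx, dy).
# Assumption: y grows downward, code 0 is east, codes increase
# counter-clockwise as seen on the screen.
DIRECTIONS: Dict[Point, int] = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}


def chain_code(points: List[Point]) -> List[int]:
    """Return the chain code of a contour given as consecutive pixels.

    Each pair of neighbouring points must be 8-connected, i.e. differ
    by at most one pixel in x and in y.
    """
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points are not 8-connected: {step}")
        codes.append(DIRECTIONS[step])
    return codes
```

Tracing a one-pixel square from (0, 0) to the right, down, left and back up yields the codes 0, 6, 4, 2 under the numbering assumed here. Such code strings can then be stored and compared between a template and an input character.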

C) A Framework for Word Spotting In 
Scanned Urdu Documents by Exploiting 
the Dot Orientation [3] 

This paper presents a data reduction 
framework for scanned Urdu documents, based 
on exploiting the dot orientation for word 
spotting. Owing to the higher number of dots in the 
Urdu alphabet (as compared to English) and the ease 
of calculation, dot orientation proved to be a 
good choice for word spotting, as 
demonstrated in the paper. The proposed algorithm 
implements five phases, which are 
as follows: Document Tilt Removal, Dot Spotting, 
the Dot Character Database, Text Size Variation 
and Word Spotting. The algorithm was applied to 
different documents and results were generated. 

D) OCR-Free Table of Contents Detection in 
Urdu Books [4] 

This paper reports an initial effort to address 
the task of identifying the ToC of old documents 
that cannot be processed using OCR technologies. The 
research presented in the paper deals with 
ToC page detection through an OCR-free 
algorithm. The suggested algorithm is a 
combination of rule-based techniques and machine 
learning, and it exploits the specific 
characteristics of a typical Urdu ToC page. The 
proposed algorithm is evaluated on Urdu books and 
digests. Applications of such algorithms may include 
offline and/or online digital libraries of cursive 
writings. 

E) Choice of Recognizable Units for Urdu 
OCR [5] 



The research paper presents a statistical 
analysis of an Urdu corpus to collect and 
organize Urdu strings. To reduce the class count, 
ligatures with similar primary components are 
grouped together. Initially the Urdu word is 
segmented into ligatures and isolated characters. 
The ligatures are then 
further segmented into characters. It is noted 
that developing a complete ligature 
recognition system requires the 
identification of all possible primary and secondary 
connected components. 

F) An Annotated Urdu Corpus of 
Handwritten Text Image and 
Benchmarking of Corpus [6] 

The methodology proposed in this paper is to 
design and produce an Urdu corpus consisting of 
complete Urdu text sentences. The corpus 
comprises a database of handwritten text 
forms. To capture the widest range of 
variation, the forms are filled in by different 
writers with varied backgrounds from 
diverse geographical locations. The benefit of the 
proposed corpus is that more words can be added 
later by the same markup procedure, 
in which all annotation information is 
entered manually when a new 
handwritten text form is inserted. 

G) Automatic Recognition of Offline 
Handwritten Urdu Digits In 
Unconstrained Environment Using 
Daubechies Wavelet Transforms [7] 

An OCR system for handwritten Urdu digits 
is presented in this paper. As the paper notes, 
the major function of a pattern recognition system 
is to make decisions regarding the class membership 
of the patterns with which it is presented. In this 
work, various Daubechies wavelet transforms are 
applied to extract the wavelet coefficients. 
The recognition accuracy is enhanced by the use of 
this approach. 
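
As an illustration of how wavelet coefficients are obtained, the sketch below computes one level of the simplest Daubechies transform, db1, which is the Haar wavelet. The surveyed paper applies "various" Daubechies wavelets whose orders are not stated here, so this is only the most basic member of the family; the function name haar_dwt and the sign convention of the detail coefficients are assumptions.

```python
import math
from typing import List, Tuple


def haar_dwt(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the Haar (Daubechies-1) discrete wavelet transform.

    Returns (approximation, detail) coefficients. The input length must
    be even. Each pair (a, b) of neighbouring samples is mapped to
    (a + b) / sqrt(2) and (a - b) / sqrt(2), which preserves the energy
    of the signal.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s
              for i in range(0, len(signal), 2)]
    return approx, detail
```

Applied to each row and then each column of a digit image, a transform of this kind yields coefficient sub-bands that can serve as features for a classifier; higher-order Daubechies wavelets use longer filters but follow the same scheme.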

H) The optical character recognition of 
Urdu-like cursive scripts [8] 

This paper is one of the rare efforts to 
compile the work on Urdu-like script 
recognition, with particular reference to the 
Nasta'liq and Naskh script formats. The survey 
can be summarized as follows: a huge 
character set and similarly shaped characters 
make the case of Urdu-like scripts more 
complex and challenging. Offline character 
matching is arguably more difficult than its online 
counterpart, as less information is available. 
The approach produced reasonably accurate 
results on many documents, such as newspapers 
and books. The advantage of the approach is that it 
may not only reduce the lexicon but also help to 
build a multilingual OCR. 

I) N-gram and Gazetteer List Based Named 
Entity Recognition for Urdu [9] 

This paper presents a statistical Named 
Entity Recognition (NER) system for the Urdu 
language using two basic n-gram models, namely 
unigram and bigram. The objective of this NER 
system is to recognize 
five classes of NEs: Person, Location, 
Organization, Date and Time. A brief review 
of different approaches used for the NER task in 
various languages is given. The paper reports 
significant results even with a 
small amount of training data. 
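
One common way to read "unigram and bigram models" for tagging is sketched below: each word receives the tag it most often carried in training, with the bigram (previous word, word) consulted first and the unigram used as a fallback. This is an assumption about the general technique, not the surveyed paper's exact model; the function names, the "O" label for non-entities and the transliterated toy sentences are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Tagged = List[Tuple[str, str]]  # one sentence as (word, tag) pairs
START = "<s>"                   # marker for the sentence start


def train(sentences: List[Tagged]):
    """Count tags per word (unigram) and per (previous word, word) pair."""
    unigram: Dict[str, Counter] = defaultdict(Counter)
    bigram: Dict[Tuple[str, str], Counter] = defaultdict(Counter)
    for sentence in sentences:
        prev = START
        for word, tag in sentence:
            unigram[word][tag] += 1
            bigram[(prev, word)][tag] += 1
            prev = word
    return unigram, bigram


def tag(words: List[str], unigram, bigram, default: str = "O") -> List[str]:
    """Tag words with the bigram model, backing off to the unigram model."""
    tags = []
    prev = START
    for word in words:
        if (prev, word) in bigram:
            best = bigram[(prev, word)].most_common(1)[0][0]
        elif word in unigram:
            best = unigram[word].most_common(1)[0][0]
        else:
            best = default  # unseen word: assume it is not an entity
        tags.append(best)
        prev = word
    return tags


# Hypothetical transliterated training data, for illustration only.
TOY_DATA = [
    [("Ali", "PERSON"), ("Lahore", "LOCATION"), ("gaya", "O")],
    [("Lahore", "LOCATION"), ("bara", "O"), ("shehar", "O")],
]
```

A gazetteer list, as mentioned in the title of the surveyed paper, could be added as a further lookup before the default label is assigned.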

J) Multi-font Numerals Recognition for Urdu 
Script based Languages [10] 

The similarities and differences between 
the two scripts, old Arabic and Urdu, are 
presented in this paper from the character 
recognition viewpoint. Rule-based, 
HMM and hybrid approaches are presented to 
recognize digits written in 
both Arabic and Urdu forms, for both online and 
offline input. The suggested technique works for 
numeric input. The paper reports solving the 
problem of separating Urdu and Arabic numerals. 

K) Segmentation Based Urdu Nastalique 
OCR [11] 

The paper explores a segmentation-based system 
that is capable of recognizing the Urdu Nastalique 
font. Its main focus is the 
development of an OCR using a Hidden Markov 
Model, because it can accurately handle large data 
sets and can be trained to handle noise and 
distortion to some extent, together with a rule-based 
post-processor. The system takes a monochrome 
scanned image as input. A few letters are not 
recognized accurately, and the technique still needs 
to be tested on real data and extended to cover the 
entire set of Urdu letters at a variety of font sizes. 

L) An Efficient Method for Urdu Language 
Text Search in Image Based Urdu Text 
[12] 

A simple and robust technique for locating 
a character in Urdu text images is presented in this 
paper. The proposed method is 
independent of script. Initially the input image is 
matched against a set of template characters 
representing each class. The distance between every 
input image and each template character is 
calculated, and the character is assigned to the 
class of the template producing the best match. 
The results show that the template matching 
technique can be applied to locate a character or a 
whole ligature inside an image accurately. 
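
The matching step described above, in which an input image is assigned to the class of the closest template, can be sketched as follows. The distance measure used in the surveyed paper is not stated here, so the count of differing pixels (Hamming distance) on binary images is an assumption, as are the function names and the tiny 3x3 example templates.

```python
from typing import Dict, List

Image = List[List[int]]  # binary image as rows of 0/1 pixels


def hamming_distance(a: Image, b: Image) -> int:
    """Number of pixel positions at which two same-sized images differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same shape")
    return sum(pa != pb
               for ra, rb in zip(a, b)
               for pa, pb in zip(ra, rb))


def classify(image: Image, templates: Dict[str, Image]) -> str:
    """Return the label of the template closest to the input image."""
    return min(templates,
               key=lambda label: hamming_distance(image, templates[label]))


# Hypothetical 3x3 templates, for illustration only.
TEMPLATES = {
    "bar": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "dash": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}
```

In practice the input would first be scaled to the template size, and a grayscale measure such as a sum of squared differences could replace the pixel count.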

M) Combining Offline and Online 
Preprocessing for Online Urdu Character 
Recognition [13] 

In this research paper a new technique is 
proposed for preprocessing Urdu online text, in which 
both the online and offline domains are used to 
remove variations and to improve the performance 
of the recognition system for online input. 
Different operations are performed on the 
input strokes from both the offline and online views. 
These include stroke segmentation, de-hooking, 
interpolation, stroke combination, smoothing and 
baseline. Efficiency can be increased by using 
joint online and offline preprocessing; 
strokes are converted into images to 
perform the offline preprocessing steps. 
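
Two of the preprocessing steps named above, smoothing and interpolation, can be sketched for a stroke given as a list of pen positions. This is a generic illustration, not the surveyed paper's procedure: the moving-average window, the max_gap parameter and the function names are assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def smooth(points: List[Point], window: int = 3) -> List[Point]:
    """Moving-average smoothing of a stroke.

    Each point is replaced by the mean of its neighbours inside the
    window; the window shrinks at the two ends of the stroke.
    """
    half = window // 2
    out = []
    for i in range(len(points)):
        lo, hi = max(0, i - half), min(len(points), i + half + 1)
        xs = [p[0] for p in points[lo:hi]]
        ys = [p[1] for p in points[lo:hi]]
        out.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return out


def interpolate(points: List[Point], max_gap: float = 1.0) -> List[Point]:
    """Insert evenly spaced points so that no gap exceeds max_gap.

    Fast pen movements leave large gaps between samples; linear
    interpolation fills them in.
    """
    if not points:
        return []
    out = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dist = math.hypot(x1 - x0, y1 - y0)
        steps = max(1, math.ceil(dist / max_gap))
        for k in range(1, steps + 1):
            t = k / steps
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out
```

After such online steps, the stroke points can be drawn into a bitmap so that image-based (offline) operations can be applied to the same input.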

N) Layout Analysis of Urdu Document 
Images [14] 

A layout analysis system for Urdu documents is 
described in this paper. The method had been shown 
to work well on Roman script, so it was adapted 
to Urdu documents. The evaluation of the 
algorithm is done in two steps. The first step 
evaluates the text-line errors, and the 
second evaluates the reading order algorithm. 
Newspaper documents proved to be the 
hardest class, posing several challenges compared 
to the others. 

O) Challenges of Urdu Named Entity 
Recognition: A Scarce Resourced 
Language [15] 

In this research paper a brief overview of 
Named Entity Recognition systems is given. 
NER is the process of searching a text to detect 
entities and categorize them into predefined 
classes such as the names of organizations, 
locations, time expressions, persons and 
quantities. The Urdu NER task has not been 
thoroughly investigated or experimented with due 
to scarce resources and the inherent complex 
features. Hence the Urdu language demands detailed 
investigation regarding the application of the 
existing techniques employed for NER in other 
languages. 

III. ANALYSIS 
This paper tabulates all 15 research 
papers on the basis of the following parameters: 
training set, testing set, recognized set, strokes, 
letter shape, font style, image style and type, 
image size and category of data. All these 
parameters are grouped under the heading of 
character set, as they describe the nature and 
specification of the data used in the research papers. 
Furthermore, other parameters are also 
evaluated: smoothing, chain code 
generation, storing the calculated strings, 
segmentation, image transformation, filters, 
document skew angle removal and recognition 
algorithms. All of these are grouped under the 
heading of Algorithm, as they describe the methods 
used for extracting the desired output needed for 
further evaluation. All these parameters with their 
possible values are tabulated in Table I (Parameter 
Table). 






http://sites.google.com/site/ijcsis/ 
ISSN 1947-5500 



(IJCSIS) International Journal of Computer Science and Information Security, 
Vol. 13, No. 6, June 2015 



Analysis is the detailed examination of the 
different features of a system. Our paper analyses 
the surveyed research papers on the basis of 
the parameters listed above, divided as 
described in the preceding paragraph. 

The first portion is the data set, which covers the 
parameters that form the data set for 
training and testing. The data set involves 
characters as well as images. Hence their type, style 
and size are analyzed in this portion and are 
presented in Table II. 

The second portion of the parametric analysis 
covers the algorithm parameters, with the 
values found in each research 
paper individually. The analysis shows the following 
algorithms to be used extensively: segmentation, 
image transformation, filters and several 
recognition algorithms. These are tabulated in Table 
III. 

There must be a proper way of 
carrying out a particular task; in terms of Urdu 
document recognition, it is specified as a scientific 
procedure and generally categorized as a technique. 
Different research papers use different techniques 
to achieve the desired results; the techniques are 
listed against their respective research papers in 
Table IV. Because of strokes, ligatures and 
other parts of the Urdu language's corpus, 
recognition is difficult, but the analysis shows an 
average precision of 65-70% for the given data sets 
of the analyzed research papers. Similarly, high 
recognition accuracy is also difficult to achieve, 
but its average is higher than the recognition rate, 
at about 92-93%. 

IV. CONCLUSION 
The survey of Urdu documents leads to the 
following conclusions regarding data 
sets, techniques and algorithms, efficiency and 
effectiveness. The widely used data sets involve 
training sets of images in which characters are 
defined [1], [6], [8] and [11], handwritten 
text [6], and diacritics [1], [3], [9], [10], 
[11], [12] and [15]. The Naskh font style is used in 
[2] and [8] and Nastaleeq in [3] and [4]. Letter 
shape is also described in [8]. In addition, image 
type, style and size are also reported. Overall, 
grayscale images are widely used to give the desired 
output. Segmentation is widely used: [2], [3], [6], 
[8], [10] and [13] define many types of segmentation. 
Image transformation algorithms are also used along 
with recognition algorithms. 

HMM is an extensively used technique that uses 
hidden states to model a system; it 
is used in four research papers: [6], [10], [11] and 
[15]. 

For storing data, the most widely used 
technique is the rule-based technique, which 
experts also use for building new styles. 
It is applied in [4] and [15]. Similarly, there is 
a large focus on the matching technique, which 
evaluates results by comparing different units. 
It carries the system from the training set to the 
testing set, making recognition more reliable. 

As the spotlight is on Urdu documents, the 
importance of Urdu lies in its characters. For that 
reason OCR, which is used to 
convert images of typewritten and printed text into 
machine-encoded text, cannot be neglected. 

V. FUTURE WORK 
The techniques discussed in this paper are 
just an initial step of implementation, and there is 
still a lot of work for future enhancement. A new 
approach combining HMM or a rule-based 
approach with the template matching technique could 
also be used to tackle the problems in Urdu 
document recognition. 

TABLE I (PARAMETER TABLE) 

Sr. | Parameters | Value 
--- | --- | --- 
1 | Training Set | Ligatures, diacritics, partial words, unique words, handwritten text, isolated characters, isolated words, handwritten Urdu digits, images (isolated digits, numeral strings, special symbols, isolated characters, financial Urdu words, different patterns), Named Entities, digits, samples, words, text lines 
2 | Recognized | Ligatures, words, partial words, unique words, Printed, Handwritten, Online, Offline 
3 | Letters' shape | isolated, initial, final, medial 
4 | Font Style | Naskh & Nastaleeq 
5 | Image Type, Style | monochrome scanned image, scanned, computer generated, handwritten, grayscale, binary 
6 | Image/Page Size | 512*512 (source image), 300-dpi, 150-dpi, 24*24 (template), 1:3 (aspect ratio) 
7 | Segmentation | line segmentation (Horizontal, Vertical, Diagonal, Up, Down, Right, Left, Diagonal Right Downward, Diagonal Left Downward) & isolated character segmentation, stroke segmentation 
8 | Image Transformation | Grayscale to binary, Pre-processing Step (Sauvola's Method), Color Image to Binary 
9 | Filters | one dot, two dots, three dots 
10 | Recognition Algorithm | Handwritten Text Recognition Algorithm, Multilingual Cursive Script Character Recognition Algorithm, Statistical Named Entity Recognition (N-gram, Unigram, Bigram Algorithm) 



TABLE II (ANALYSIS TABLE: DATASET) 

Ref. No | Training Set | Recognized | Font Style | Letters' shape | Image Style/Type | Image/Page Size 
--- | --- | --- | --- | --- | --- | --- 
S. A. Husain et al. [1] | 240 ligatures & 6 diacritics | 864 ligatures & 50,000 words | N/A | isolated, initial, final, middle | N/A | N/A 
Tabassam Nawaz et al. [2] | N/A | N/A | Naskh | minimum 2, maximum 4 | Grayscale | N/A 
Muhammad Shafi et al. [3] | 215 partial words & 95 unique words | 8714 partial words & 3615 unique words | Nastaleeq | N/A | N/A | N/A 
Adnan Ul-Hasan et al. [4] | N/A | N/A | Nastaleeq | N/A | Grayscale | N/A 
Gurpreet Singh Lehal [5] | N/A | N/A | N/A | N/A | N/A | N/A 
Prakash Choudhary et al. [6] | 343 handwritten text, 44 isolated characters & 57 isolated words | N/A | N/A | N/A | Grayscale | 300-dpi 
Imtiyaz Ahmed Ansari et al. [7] | 2000 samples | N/A | N/A | N/A | Color Image | 64*64 
Saeeda Naz et al. [8] | 109,588 images (60,329 isolated digits, 12,914 numeral strings, 1705 special symbols, 14,890 isolated characters, 19,432 financial Urdu words, 318 different patterns) | N/A | Nastaleeq/Naskh | isolated/joined character | N/A | N/A 
Faryal Jahangir et al. [9] | 2313 Named Entities | N/A | N/A | N/A | N/A | N/A 
Muhammad Imran Razzak et al. [10] | 3000 digits | N/A | N/A | N/A | N/A | N/A 
Sobia Tariq Javed et al. [11] | 100 samples, 18600 words, 1569 ligatures | 1569 ligatures | N/A | isolated, initial, final, middle | monochrome scanned images | 150-dpi 
Khalil Khan et al. [12] | | N/A | N/A | N/A | scanned, handwritten, computer generated image, grayscale | 512*512 (source image), 42*24 (template image) 
Muhammad Imran Razzak et al. [13] | N/A | N/A | N/A | N/A | | 
Faisal Shafait et al. [14] | 234, 286, 702, 1158, 819 (text-lines) | N/A | N/A | N/A | Binary image | 1:3 (aspect ratio) 
Saeeda Naz et al. [15] | N/A | N/A | N/A | N/A | N/A | N/A 



TABLE III (ANALYSIS TABLE: ALGORITHMS) 

Ref. No | Segmentation | Image Transformation | Filters | Recognition Algorithms 
--- | --- | --- | --- | --- 
S. A. Husain et al. [1] | N/A | N/A | N/A | N/A 
Tabassam Nawaz et al. [2] | line segmentation & isolated character segmentation | Grayscale to binary | one dot, two dots, three dots | N/A 
Muhammad Shafi et al. [3] | N/A | N/A | N/A | N/A 
Adnan Ul-Hasan et al. [4] | N/A | Pre-processing Step (Sauvola's Method) | N/A | N/A 
Gurpreet Singh Lehal [5] | Reduces classes to 2328 (2190 primary ligatures, 22 secondary ligatures, 41 primary isolated characters & 95 touching components) | N/A | N/A | N/A 
Prakash Choudhary et al. [6] | line & isolated character segmentation | N/A | N/A | Handwritten Text Recognition Algorithm 
Imtiyaz Ahmed Ansari et al. [7] | N/A | Colored to Binary | Median Filter | N/A 
Saeeda Naz et al. [8] | line (horizontal, vertical, diagonal) & isolated character segmentation | N/A | N/A | Multilingual cursive script character recognition algorithm 
Faryal Jahangir et al. [9] | N/A | N/A | N/A | Statistical Named Entity Recognition (N-gram, Unigram, Bigram Algorithm) 
Muhammad Imran Razzak et al. [10] | Up, Down, Right, Left, Diagonal Right Downward, Diagonal Left Downward | N/A | N/A | N/A 
Sobia Tariq Javed et al. [11] | N/A | N/A | N/A | N/A 
Khalil Khan et al. [12] | N/A | N/A | Median Filter | N/A 
Muhammad Imran Razzak et al. [13] | stroke segmentation | N/A | N/A | N/A 
Faisal Shafait et al. [14] | N/A | N/A | N/A | N/A 
Saeeda Naz et al. [15] | N/A | N/A | N/A | Name Entity Recognition Algorithm 



TABLE IV (TECHNIQUES & RESULTS OF ANALYSIS)

Ref. No | Technique | Results: Recognition Rate/Precision | Results: Recognition accuracy
S. A. Husain et al. [1] | STNN, OLUCR, Tree based Dictionary Search, Intuos Wacom Board |  | 
Tabassam Nawaz et al. [2] | Special Matching Technique, Morphology Technique (pepper noise removal process, thinning process) | 93% for base strokes & 98% for secondary strokes | N/A
Muhammad Shafi et al. [3] | Database retrieval | 15 char/sec | 89%
Adnan Ul-Hasan et al. [4] | Machine Learning, Rule Based Technique | N/A | N/A
Gurpreet Singh Lehal [5] | OCR | 69% | 88%
Prakash Choudhary et al. [6] | HMM, Annotation, Word-level, complete level | N/A | 99%
Imtiyaz Ahmed Ansari et al. [7] | BPNN | N/A | N/A
Saeeda Naz et al. [8] | Neural Network | N/A | 92.07%
Faryal Jahangir et al. [9] | Smoothing Technique, bigram NER tagger | N/A | 91.54% for Naskh & 94.5% for Nastaleeq
Muhammad Imran Razzak et al. [10] | Fuzzy linguistic, HMM, Hybrid Approach | 65.21% precision for n-gram, 66.2% for bigram | 
Sobia Tariq Javed et al. [11] | HMM, Jang Chin Algorithm | N/A | 98.60% digits, 98.49% uppercase letters, 97.44% lowercase letters, 97.40% combined set
Khalil Khan et al. [12] | Template Matching | N/A | 92.73%
Muhammad Imran Razzak et al. [13] | Novel technique, Bresenham's line algorithm | 100% for 5-character ligatures, 87% for 3-character ligatures, 78% for 2-character ligatures | N/A
Faisal Shafait et al. [14] | Layout Analysis | N/A | N/A
Saeeda Naz et al. [15] | HMM, ME, CRF, SVM, ML approaches, rule based approach | N/A | N/A

Abstract: Mapping the virtual machines to the physical
machine cluster is called VM placement. Placing each VM on an
appropriate host is necessary to ensure effective resource
utilization and to minimize datacenter cost and power
consumption. We present an efficient hybrid, genetic-based,
host-load-aware algorithm for scheduling and optimizing virtual
machines in a cluster of physical hosts. The algorithm works in
two stages. First, initial VM packing is done by checking the
load of the physical hosts and the user constraints of the VMs.
Second, the placed VMs are optimized by a hybrid genetic
algorithm driven by a fitness function. Our simulation results
show that the proposed algorithm outperforms existing methods
and improves resource utilization by accommodating more virtual
machines per physical host.

Index Terms: Virtual Machine, Physical Machine Cluster, 
VM Scheduling, Load Rebalancing, Load Monitoring. 

I. Introduction 
Infrastructure-as-a-Service (IaaS) is the most fundamental use
of cloud computing, and virtualization technology is the basis
of an IaaS platform. IaaS offers complete computing resources
for deploying and executing applications, storing data, or
hosting a company's entire computing environment [3].
Virtualization lets cloud data centers host applications on
shared infrastructure, and data center expenses can be reduced
by using virtual machines (VMs). Cloud data center providers
can create a large number of VMs for different types of
workload and specification requirements [4]. Each VM is
configured with an amount of computing resources adequate for
its workload. Cloud service providers can consolidate the VMs
onto a small number of physical hosts in order to reduce the
total number of physical servers required and to exploit server
capacity more fully, which saves money on hardware and energy.
VM consolidation is therefore key to achieving economies of
scale in a cloud data center [5].
The advent of virtualization technology enables physical server
consolidation in datacenters, which plays a vital role in
minimizing both the number of physical servers used and the
energy consumed. Researchers have proposed various approaches
to server consolidation, but none of them considers all the
aspects needed to ensure QoS while also reducing cost for
datacenter administrators. A new algorithm is therefore needed
that provides better service to cloud users while reducing the
operational cost of the service provider. Placing each VM on an
appropriate host is necessary to ensure effective resource
utilization and to minimize datacenter cost and power
consumption. To address this problem, we propose a new
efficient hybrid, genetic-based, host-load-aware algorithm for
scheduling and optimizing virtual machines in a cluster of
physical hosts. We divide the problem into the following two
parts.
A. Initial Scheduling of VMs 

The VM allocation problem in cloud infrastructure has been
investigated by many researchers. However, most of the proposed
mechanisms pay no attention to the ever-changing load of the
physical hosts or to the dynamic nature of the VM deployment
requests that frequently reach the cloud provider's
infrastructure. We present an efficient hybrid host-load-aware
algorithm for scheduling virtual machines onto a cluster of
physical hosts. The algorithm relies on two inputs. The first
is the load of the physical host: the load factor is measured
by analyzing the utilization level of individual resources such
as CPU, memory and network bandwidth. The second is the past
utilization behavior of a VM on a physical host.

B. Ongoing Load Rebalancing or Optimization 

Rebalancing load in a datacenter requires live VM migrations,
but frequently moving many VMs between physical hosts increases
network bandwidth utilization and datacenter cost. Load
rebalancing therefore has to be achieved with a minimum number
of VM migrations. To this end we use a modified version of a
hybrid genetic algorithm for load optimization. The main
contributions of this paper are an introduction to
virtualization technology, a new algorithm for initial VM
scheduling, a method for ongoing load rebalancing
(optimization), and validation of the proposed algorithm in a
simulated environment.

The rest of the paper is organized as follows. Section II
describes related work, and Section III explains the placement
problem under study. Section IV presents the design model
behind the proposed strategy. Section V discusses the proposed
algorithm for VM scheduling. Section VI presents load balancing
and VM optimization based on a genetic algorithm. Section VII
describes the experimental setup and compares the results of
our technique with existing strategies for VM placement and
optimization. Section VIII concludes the paper and highlights
possible future directions.

II. Related work 

Most IaaS cloud data centers use virtualization technology
because it provides flexibility in provisioning and placing
servers and their associated workloads, as well as cost savings
[6] [7]. While this model offers a number of advantages, the
allocation of virtual machines to the physical hosts in the
data center must be managed carefully. Although many
researchers have studied this VM mapping problem, we draw
attention to the work closest to ours.

In [8], the number of physical machines needed to deploy the
requested VM instances is reduced by combining time-series
forecasting techniques with a bin-packing heuristic, but the
model does not capture the relationships between multiple
resources such as CPU and I/O. In [9], the VM placement
algorithms exploit general properties of VM behavior. In [10],
a two-level control management system places virtual machines
on physical machines and uses combinatorial and multi-phase
efficiency to resolve potentially inconsistent scheduling
constraints. In [11], VM scheduling constraints are treated as
a single dimension of a multidimensional knapsack problem.

In [12], VM scheduling policy is addressed primarily from the
viewpoint of network traffic; three common scheduling
algorithms for cloud computing are introduced and simulation
results are provided. In [13], load balancing in data centers
is studied intensively, and heuristics are used as a common
approach to balance load among physical servers. In [14],
performance variations are identified and monitored on a
physical server hosting VMs. A few simple VM placement
algorithms, such as time-shared and space-shared, are presented
and compared in [15], which also introduces a method to model
and simulate cloud computing environments in which the
algorithms can be implemented. The authors of [16] pioneered
methods for virtual machine migration and proposed several
migration techniques and algorithms. The most important
load-balance scheduling algorithms for conventional Web servers
are evaluated in [17]. VectorDot, a novel load-balancing
algorithm introduced in [18], handles structured and
multi-dimensional resource limitations by taking the servers
and storage of a cloud into account. A quantitative measure of
load imbalance on virtualized data center servers is proposed
in [19]. A comparative study of widely used VM placement
strategies and algorithms for cloud data centers is presented
in [20]. An overloaded-resource-based VM placement approach is
presented in [21]. Our previous study [22] compared various VM
scheduling algorithms and demonstrated the need for a new,
efficient VM placement algorithm.

A genetic-based simulated annealing algorithm for optimizing
task scheduling in cloud computing is proposed and implemented
in [23]. This algorithm considers only the QoS requirements of
various types of tasks. Genetic operators that use a
group-oriented structure lead to better results than
non-grouping genetic algorithms, which lack this feature. In
[24] [25], grouping-based genetic algorithms achieve better
results than conventional methods and general heuristic
algorithms.

III. Problem Formulation

The main principle of an IaaS cloud computing system is that
its users can use the resources to obtain good performance and
economic benefits. With virtualization, resources are delivered
to users in the form of virtual machines, so an efficient VM
allocation policy and management process is required to avoid
underutilization or overutilization of the physical machines,
either of which may affect the quality of service of the IaaS
cloud. Underutilization of servers is a well-known cost concern
in cloud management. Low utilization of server resources means
that more physical machines are used, which increases the power
cost of the machines and the capital and operational cost of
the cooling systems. Moreover, surplus machines enlarge the
carbon footprint.

Overutilization of physical servers results in violations of
the SLA and of quality-of-service constraints. Efficient
allocation of VM instance requests meets client requirements,
improves resource utilization, increases the overall
performance of the cloud computing environment and decreases
the number of physical machines used. Efficient VM scheduling
with ongoing load monitoring and optimization is therefore an
important problem to solve in IaaS cloud computing.

IV. Description of design model 
To address the VM scheduling and ongoing load optimization
problem, we propose a multi-dimensional, physical-host-load-aware
scheduling algorithm and a hybrid genetic-based optimization
algorithm. We implemented these heuristics in Java using the
NetBeans IDE.

Figure 1: Framework model for VM placement in a cluster of physical machines



Figure 1 shows the framework model in which the proposed
algorithm is implemented. A physical cluster is formed from a
set of physical servers, each contributing its own share of
resources such as CPU cores, main memory, disk capacity and
network bandwidth. Users create VM instances by specifying the
requirements of the applications they want to run, and submit
the VM requests to the computing system. Submitted VM requests
wait for their turn in a queue. The virtual machine scheduler
handles each request: it estimates the VM size and checks the
availability and capacity of the physical machines. When it
finds an appropriate physical machine, the scheduler
immediately allocates that machine to the queued VM request and
the required resources are assigned to the virtual machine.
Load rebalancing in this environment is handled by the virtual
machine optimizer, for which we use a modified version of a
genetic algorithm.

V. Algorithm design for the process of virtual machine allocation

This is a simple and efficient method that uses the load factor
of the physical machine (PM) together with the user-supplied
constraints on the VM's resource requirements. It also
identifies overloaded PMs, selects the VMs to migrate based on
their past behavior, and picks an appropriate target PM based
on its resource utilization rate. It then finds underutilized
PMs, migrates the VMs running on them to other suitable PMs,
and turns the emptied PMs off to save energy. Because the
resource requirements and behavior of a VM cannot be forecast
accurately, our algorithm relies on the workload details
supplied by the user at deployment and considers the load
factor of both the physical machine and the physical machine
cluster to identify an appropriate PM for a given VM request.
We combine a bin-packing heuristic with three different
algorithms in order to minimize the number of physical machines
required to place a set of VMs, place VMs quickly and
correctly, keep the load balanced among the servers and
increase the resource utilization rate, all without violating
any SLA.

The problem is to place N virtual machines with resource
requirements VR (CPU, memory, network bandwidth) on a set of M
physical machines with resource capacities PR (CPU, memory,
network bandwidth), grouped into K physical machine clusters.

Let PM be the set of all the physical machines in the entire
system, $PM = \{PM_1, PM_2, PM_3, \ldots, PM_m\}$, where m is
the total number of physical machines. An individual physical
machine is denoted $PM_i$, where i is the physical machine
number and $1 \le i \le m$. Similarly, the set of VMs on
physical machine i is $\{VM_{i1}, VM_{i2}, \ldots, VM_{in}\}$,
where n is the number of VMs on physical server i. To deploy
$VM_j$ on $PM_i$, the CPU, RAM and bandwidth loads have to be
calculated individually. The CPU load of $PM_i$ at time
interval ts is

$$PM_i(cpu, ts) = \sum_{j=1}^{n} VM_{ij}(cpu, ts) \tag{1}$$

The amount of RAM utilized by all the VMs of $PM_i$ at time
interval ts is

$$PM_i(ram, ts) = \sum_{j=1}^{n} VM_{ij}(ram, ts) \tag{2}$$

The amount of network bandwidth utilized by all the VMs of
$PM_i$ at time interval ts is

$$PM_i(nbw, ts) = \sum_{j=1}^{n} VM_{ij}(nbw, ts) \tag{3}$$

where $PM_i$ is the i-th physical machine of physical machine
cluster k, $VM_{ij}$ is the j-th virtual machine of $PM_i$, and
cpu, ram and nbw denote the amount of CPU, RAM and network
bandwidth, respectively, utilized by the VMs of $PM_i$.
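
Equations (1) to (3) amount to summing, per resource, the usage of every VM hosted on a physical machine. A minimal Python sketch of that aggregation follows. The dictionary layout, the units and the function name `pm_load` are illustrative assumptions and are not taken from the authors' Java implementation.

```python
from typing import Dict, List

RESOURCES = ("cpu", "ram", "nbw")  # CPU, RAM, network bandwidth


def pm_load(vms: List[Dict[str, float]]) -> Dict[str, float]:
    """Per-resource load of one PM: the sum of the usage of its VMs (Eqs. 1-3)."""
    return {r: sum(vm[r] for vm in vms) for r in RESOURCES}


# Hypothetical usage of the three VMs hosted on one PM at time interval ts
# (CPU cores, GB of RAM, Mbit/s of bandwidth; the units are assumptions).
vms_on_pm = [
    {"cpu": 2, "ram": 4, "nbw": 100},
    {"cpu": 1, "ram": 2, "nbw": 50},
    {"cpu": 4, "ram": 8, "nbw": 200},
]
load = pm_load(vms_on_pm)
print(load)  # {'cpu': 7, 'ram': 14, 'nbw': 350}
```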

From (1), (2) and (3), the weighted average load of physical
machine cluster k at time interval ts is

$$PMC_k(WL, ts) = \sum_{i=1}^{m} PM_i(WL, ts) \tag{4}$$

where $PMC_k$ is the k-th physical machine cluster of the
datacenter, WL is the weighted load at time interval ts, and
$PM_i$ is the i-th physical machine of physical machine
cluster k. At any time interval the total VM load of a PM
should not exceed the host capacity:

$$\sum_{resource} PM_i\, W_{resource}\, usage(ts) < TH_{value} < \sum_{resource} PM_i\, W_{resource}\, capacity \tag{5}$$

where $resource \in \{\text{CPU}, \text{RAM}, \text{Network Bandwidth}\}$,
$W_{resource}$ is the weight associated with each resource, and
$TH_{value}$ is the threshold value set by the administrator.
If the load goes beyond this value, the host is considered
overloaded and selected VMs have to be migrated to other
appropriate physical machines.
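
The overload test in (5) can be illustrated with a short Python sketch. It rests on assumptions that the paper does not state: usage is normalized by capacity so that resources with different units can be combined, the weights sum to 1, and the threshold is expressed as a fraction. The names `weighted_load` and `is_overloaded` and all numeric values are hypothetical.

```python
from typing import Dict


def weighted_load(usage: Dict[str, float], capacity: Dict[str, float],
                  weights: Dict[str, float]) -> float:
    """Weighted utilization of one PM: sum over resources of w_r * usage_r / capacity_r."""
    return sum(w * usage[r] / capacity[r] for r, w in weights.items())


def is_overloaded(usage: Dict[str, float], capacity: Dict[str, float],
                  weights: Dict[str, float], threshold: float) -> bool:
    """A host counts as overloaded when its weighted load exceeds the threshold."""
    return weighted_load(usage, capacity, weights) > threshold


weights = {"cpu": 0.5, "ram": 0.3, "nbw": 0.2}   # assumed weights
capacity = {"cpu": 16, "ram": 64, "nbw": 1000}   # assumed host capacity
usage = {"cpu": 12, "ram": 32, "nbw": 500}       # assumed current usage

wl = weighted_load(usage, capacity, weights)
print(round(wl, 3))                                            # 0.625
print(is_overloaded(usage, capacity, weights, threshold=0.8))  # False
```

With these numbers the weighted load is 0.5 * 0.75 + 0.3 * 0.5 + 0.2 * 0.5 = 0.625, which stays below the assumed threshold of 0.8.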

VI. Dynamic VM placement
The objective of this process is to place the VMs on PMs so
that the total number of PMs required to host all the VMs is
minimized. We treat this as a multi-dimensional bin-packing
problem; since it is NP-hard, we provide a heuristic based on
multiple policies. In the early stages of allocation most PMs
are underutilized or unused, so our heuristic behaves like a
first-fit scheduler, which is the simplest to implement and
improves the response time of VM placement. As the number of
VMs in the datacenter grows, the heuristic also considers the
utilization level of each PM, which helps keep the load
balanced among servers. In the final stages the heuristic works
according to the nature of the VM's workload, gathered from
user-provided hints, which helps avoid a bottleneck on any
particular resource as well as any SLA violation. The algorithm
is given below.
Algorithm 1: Dynamic VM placement
Step 1: Take the VM request submitted by the user at time ti and read the number of CPU cores, the amount of RAM and the amount of N/W bandwidth required.
Step 2: The scheduler maintains an index table of physical clusters and physical machines together with their state (available or busy).
Step 3: Scan the index table of physical clusters from the top for a load below 50%, until the first available physical cluster is found or the index table has been scanned fully.
Step 4: If a physical cluster is found, scan its index table of physical machines from the top for a load below 50% in all three major resources, until the first physical machine is found.
Step 5: When found, return the ID of the physical machine to the main controller.
Step 6: Assign the VM to the identified PM.
Step 7: Update the index table of the PM and physical cluster and go to step 1.
Step 8: If not found, scan the index table of physical clusters from the top for a load below 70%, until the first available physical cluster is found or the index table has been scanned fully.
Step 9: If a physical cluster is found, scan the index table of its PMs based on the requirements of the requested VM.
Step 10: If the requested VM is CPU intensive, scan the PM index table from the top for CPU utilization below 70%, until the first physical machine is found.
Step 11: When found, return the ID of the physical machine to the main controller.
Step 12: Assign the VM to the identified PM.
Step 13: Update the index table of the PM and physical cluster and go to step 1.
Step 14: If the requested VM is memory intensive, scan the PM index table from the top for RAM utilization below 70%, until the first physical machine is found.
Step 15: When found, return the ID of the physical machine to the main controller.
Step 16: Assign the VM to the identified PM.
Step 17: Update the index table of the PM and physical cluster and go to step 1.
Step 18: If the requested VM is network intensive, scan the PM index table from the top for network bandwidth utilization below 70%, until the first physical machine is found.
Step 19: When found, return the ID of the physical machine to the main controller.
Step 20: Assign the VM to the identified PM.
Step 21: Update the index table of the PM and physical cluster and go to step 1.
Step 22: If no physical cluster is found, scan the index table from the top for a load below 80%, until the first available physical cluster is found or the index table has been scanned fully.
Step 23: If found, scan the index table of the PMs based on the requirement of the requested VM.
Step 24: If the requested VM is CPU intensive, scan the PM index table from the top for the least number of CPU cores utilized, until the first physical machine is found.
Step 25: If found, check that the host has enough CPU cores to fulfill the VM's CPU requirement and will not surpass 90% load after the new VM is placed; then return the ID of the physical machine to the main controller.
Step 26: Assign the VM to the identified PM.
Step 27: Update the index table of the PM and physical cluster and go to step 1.
Step 28: Else go to step 22.
Step 29: If the requested VM is memory intensive, scan the PM index table from the top for the least amount of RAM utilized, until the first physical machine is found.
Step 30: If the host has enough RAM to fulfill the VM's memory requirement and will not surpass 90% load after the new VM is placed, return the ID of the physical machine to the main controller.
Step 31: Assign the VM to the identified PM.
Step 32: Update the index table of the PM and physical cluster and go to step 1.
Step 33: Else go to step 22.
Step 34: If the requested VM is network intensive, scan the PM index table from the top for the least amount of network bandwidth utilized, until the first physical machine is found.
Step 35: If the host has enough bandwidth to fulfill the VM's bandwidth requirement and will not surpass 90% load after the new VM is placed, return the ID of the physical machine to the main controller.
Step 36: Assign the VM to the identified PM.
Step 37: Update the index table of the PM and physical cluster and go to step 1.
Step 38: Else go to step 22.
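
The three tiers of Algorithm 1 (50%, 70% and 80% cluster load, with a 90% cap in the last tier) can be condensed into the following Python sketch. It is a simplified reading, not the authors' implementation. The class `PM`, the function names, the definition of cluster load as the mean utilization over all PMs and resources, and the capacity check applied in every tier are assumptions. Unlike the step list, the sketch also tries every qualifying cluster instead of stopping at the first one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

RESOURCES = ("cpu", "ram", "nbw")


@dataclass
class PM:
    pm_id: str
    capacity: Dict[str, float]
    used: Dict[str, float] = field(default_factory=lambda: dict.fromkeys(RESOURCES, 0.0))

    def util(self, r: str, extra: float = 0.0) -> float:
        """Utilization of resource r, optionally after adding `extra` demand."""
        return (self.used[r] + extra) / self.capacity[r]

    def fits(self, demand: Dict[str, float]) -> bool:
        return all(self.util(r, demand[r]) <= 1.0 for r in RESOURCES)


def cluster_load(pms: List[PM]) -> float:
    # Assumption: cluster load = mean utilization over all PMs and resources.
    return sum(pm.util(r) for pm in pms for r in RESOURCES) / (len(pms) * len(RESOURCES))


def assign(pm: PM, demand: Dict[str, float]) -> str:
    for r in RESOURCES:
        pm.used[r] += demand[r]  # stands in for updating the index tables
    return pm.pm_id


def place_vm(clusters: List[List[PM]], demand: Dict[str, float], vm_type: str) -> Optional[str]:
    """Return the ID of the chosen PM, or None if no tier accepts the VM."""
    tiers = (
        (0.5, lambda pm: all(pm.util(r) < 0.5 for r in RESOURCES)),  # Steps 3-7
        (0.7, lambda pm: pm.util(vm_type) < 0.7),                    # Steps 8-21
    )
    for limit, acceptable in tiers:
        for pms in clusters:
            if cluster_load(pms) < limit:
                for pm in pms:  # first fit, scanning from the top
                    if acceptable(pm) and pm.fits(demand):
                        return assign(pm, demand)
    for pms in clusters:  # Steps 22-38
        if cluster_load(pms) < 0.8:
            # Least-utilized PM first for the VM's dominant resource.
            for pm in sorted(pms, key=lambda p: p.util(vm_type)):
                if pm.fits(demand) and pm.util(vm_type, demand[vm_type]) <= 0.9:
                    return assign(pm, demand)
    return None


cluster = [
    PM("pm1", {"cpu": 10, "ram": 10, "nbw": 10}, {"cpu": 6, "ram": 2, "nbw": 2}),
    PM("pm2", {"cpu": 10, "ram": 10, "nbw": 10}, {"cpu": 2, "ram": 2, "nbw": 2}),
]
chosen = place_vm([cluster], {"cpu": 1, "ram": 1, "nbw": 1}, vm_type="cpu")
print(chosen)  # pm2
```

In the example the cluster load is about 27%, so the first tier applies. `pm1` is skipped because its CPU utilization is 60%, and the request lands on `pm2`.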

VII. Load balancing among physical servers 

Because VM workloads change frequently over time, a good
initial placement is not sufficient to keep the load balanced.
Placements must be reworked dynamically so that QoS constraints
remain satisfied as the data center load changes. Keeping the
load balanced among servers requires many VM migrations, which
increases the operational cost of the service provider, so VMs
should be rearranged in a way that minimizes the number of
migrations while still satisfying the resource utilization and
load balance goals. In such multifaceted problems, even the
most prominent algorithms cannot capture all the relationships
between VMs, physical servers and physical clusters needed to
reach the best-optimized solution. To achieve this goal we
propose a new grouping-based genetic algorithm, which we
believe is well suited to this kind of complex optimization
problem.
A. Grouping Genetic Based Algorithm Design for Load
Balancing among Physical Servers
A genetic algorithm is a good search technique for the VM
mapping problem because of its strong optimization ability and
its parallelism, which help solve complex problems.

The common steps of a genetic algorithm are summarized as
follows:

• Create an initial population

• Repeat the following steps until the stopping condition is
reached:

• Select chromosome pairs for mating

• Perform crossover to generate new offspring

• Calculate the fitness value of the new offspring

• Create a new population
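
The generic loop listed above can be sketched in Python as follows. This is a minimal skeleton and not the paper's grouping algorithm. The function name `genetic_search`, the fixed generation count used as stopping condition, the elitism step and the toy bit-string problem are all assumptions made for illustration.

```python
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def genetic_search(
    fitness: Callable[[T], float],
    random_individual: Callable[[random.Random], T],
    crossover: Callable[[T, T, random.Random], T],
    pop_size: int = 20,
    generations: int = 30,
    seed: int = 0,
) -> T:
    rng = random.Random(seed)
    # Create an initial population.
    population: List[T] = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):  # stopping condition: fixed generation count
        # Calculate the fitness value of the current individuals.
        scores = [fitness(ind) for ind in population]
        weights = [s + 1e-9 for s in scores]  # avoid an all-zero weight vector
        offspring = []
        for _ in range(pop_size - 1):
            # Select a chromosome pair with probability proportional to fitness.
            mum, dad = rng.choices(population, weights=weights, k=2)
            # Perform crossover to generate a new offspring.
            offspring.append(crossover(mum, dad, rng))
        # Create the new population; keeping the best parent is an added elitism step.
        elite = population[scores.index(max(scores))]
        population = [elite] + offspring
    return max(population, key=fitness)


# Toy problem (hypothetical): maximize the number of 1-bits in a 16-bit string.
def random_bits(rng: random.Random) -> List[int]:
    return [rng.randint(0, 1) for _ in range(16)]


def one_point(a: List[int], b: List[int], rng: random.Random) -> List[int]:
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


best = genetic_search(sum, random_bits, one_point)
print(sum(best))
```

Because the best individual is carried over unchanged, the best fitness never decreases from one generation to the next. The skeleton assumes non-negative fitness values, which fitness-proportional selection requires.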

B. Creation of an Initial Population 

The genetic algorithm is executed in parallel on a set of
selected physical servers. Because the initial population plays
an important role in a genetic algorithm [26], we developed a
novel procedure to generate it. In the solution space for these
physical hosts, the selection process chooses solution vectors
with a probability proportional to their fitness value. The
algorithm then crosses the chosen vectors and mutates the
crossed vectors based on their fitness value. Selection,
crossover and mutation are repeated until the termination
condition is reached.

Steps for selecting the initial population

Step 1: Check the PM load against the threshold value.
Step 2: If the utilization of any PM resource surpasses the
threshold value, consider that PM an overloaded host.
Step 3: Select the overloaded servers and sort those PMs
based on their resource utilization value.
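
The three selection steps can be sketched in Python as below. The function name `select_overloaded`, the use of the peak per-resource utilization as sort key and the descending order are assumptions, since the text does not specify the sort direction.

```python
from typing import Dict, List


def select_overloaded(utilization: Dict[str, Dict[str, float]], threshold: float) -> List[str]:
    """Return the IDs of PMs with any resource above the threshold, most loaded first."""
    # Steps 1-2: a PM is overloaded if any single resource exceeds the threshold.
    peak = {pm_id: max(res.values()) for pm_id, res in utilization.items()}
    overloaded = [pm_id for pm_id, value in peak.items() if value > threshold]
    # Step 3: sort the overloaded PMs by utilization (descending order is assumed).
    return sorted(overloaded, key=peak.get, reverse=True)


# Hypothetical utilization fractions per PM and resource.
utilization = {
    "pm1": {"cpu": 0.85, "ram": 0.40, "nbw": 0.30},
    "pm2": {"cpu": 0.50, "ram": 0.60, "nbw": 0.20},
    "pm3": {"cpu": 0.70, "ram": 0.95, "nbw": 0.10},
}
seeds = select_overloaded(utilization, threshold=0.8)
print(seeds)  # ['pm3', 'pm1']
```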

C. Fitness Function 

The fitness value plays an important role in the outcome for
any individual. It measures how good an individual is relative
to the rest of the population: the higher the fitness value,
the better the individual performs, and whether an individual
survives or is discarded depends entirely on its fitness value.
The fitness function is therefore an essential part of the
genetic algorithm. The objective function is defined as follows
for m hosts in physical cluster k and n VMs on each host.



$$PM_i(R_{cpu}, ts) = PM_i(T_{cpu}, ts) - \sum_{j=1}^{n} VM_{ij}(D_{cpu}, ts) \tag{6}$$

where $PM_i(R_{cpu}, ts)$ is the remaining CPU of the i-th PM at
time slot ts, $T_{cpu}$ is the total CPU capacity of the i-th PM
and $VM_{ij}(D_{cpu}, ts)$ is the CPU demanded by the j-th VM of
the i-th physical host at time slot ts.

$$PM_i(R_{ram}, ts) = PM_i(T_{ram}, ts) - \sum_{j=1}^{n} VM_{ij}(D_{ram}, ts) \tag{7}$$

where $PM_i(R_{ram}, ts)$ is the remaining RAM of the i-th PM at
time slot ts, $T_{ram}$ is the total RAM capacity of the i-th PM
and $VM_{ij}(D_{ram}, ts)$ is the RAM demanded by the j-th VM of
the i-th physical host at time slot ts.

$$PM_i(R_{nbw}, ts) = PM_i(T_{nbw}, ts) - \sum_{j=1}^{n} VM_{ij}(D_{nbw}, ts) \tag{8}$$

where $PM_i(R_{nbw}, ts)$ is the remaining network bandwidth of
the i-th PM at time slot ts, $T_{nbw}$ is the total network
bandwidth capacity of the i-th PM and $VM_{ij}(D_{nbw}, ts)$ is
the network bandwidth demanded by the j-th VM of the i-th
physical host at time slot ts.



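$$PMC_k\,\mu R_{cpu} = \frac{1}{m} \sum_{i=1}^{m} PM_i R_{cpu} \tag{9}$$

$$PMC_k\,\mu R_{ram} = \frac{1}{m} \sum_{i=1}^{m} PM_i R_{ram} \tag{10}$$

$$PMC_k\,\mu R_{nbw} = \frac{1}{m} \sum_{i=1}^{m} PM_i R_{nbw} \tag{11}$$

where $PMC_k\,\mu R_{cpu}$, $PMC_k\,\mu R_{ram}$ and
$PMC_k\,\mu R_{nbw}$ are the k-th physical cluster's mean value
of CPU, RAM and network bandwidth, respectively.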

Our proposed algorithm considers four objectives when packing
and optimizing the virtual machines in a data center: reducing
the power consumption cost, reducing the cost of migration,
increasing the total revenue and reducing the SLA violation
rate. These objectives are pursued by evaluating the fitness
function in equation (12) while allocating the VMs:

$$\sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(PM_i R_{cpu} - PMC_k\,\mu R_{cpu}\right)^2} + \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(PM_i R_{ram} - PMC_k\,\mu R_{ram}\right)^2} + \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(PM_i R_{nbw} - PMC_k\,\mu R_{nbw}\right)^2} \tag{12}$$

The objective function of our algorithm minimizes the standard
deviation of the remaining CPU, RAM and network bandwidth
across the hosts. Because we take the load of the entire
physical cluster, rather than the number of virtual machines on
each physical host, as the load-balance metric, the objective
function tries to balance the consumption of CPU, RAM and
network bandwidth on each host. This suits a heterogeneous
environment made up of hosts with different configurations.
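
A minimal Python sketch of this objective follows: for each resource it computes the population standard deviation of the remaining capacity across hosts and adds the three values. The function name `imbalance` and the sample figures are hypothetical. The sketch returns the raw imbalance, where lower is better; how that value is mapped onto a fitness score where higher is better is not specified in the text and is left open here.

```python
from statistics import pstdev
from typing import Dict, List

RESOURCES = ("cpu", "ram", "nbw")


def imbalance(remaining: List[Dict[str, float]]) -> float:
    """Sum over resources of the standard deviation of remaining capacity across hosts."""
    return sum(pstdev([host[r] for host in remaining]) for r in RESOURCES)


# Hypothetical remaining capacity per host.
balanced = [{"cpu": 4, "ram": 8, "nbw": 100}] * 3
skewed = [
    {"cpu": 0, "ram": 8, "nbw": 100},
    {"cpu": 4, "ram": 8, "nbw": 100},
    {"cpu": 8, "ram": 8, "nbw": 100},
]
print(imbalance(balanced))          # 0.0
print(round(imbalance(skewed), 3))  # 3.266
```

In the skewed case only the CPU column varies, so the result is the standard deviation of (0, 4, 8), which is the square root of 32/3.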

D. Crossover Operator 

The crossover operator of a genetic algorithm combines the
qualities of different individuals in the population in order
to create a new generation. Ideally the new child inherits good
qualities from both parents and has better fitness. Two parents
are chosen with a probability relative to their fitness, so
individuals with a high fitness value reproduce with higher
probability than individuals with a lower fitness value. Our
crossover operator follows a method similar to the one
illustrated in [27]. All servers from both parents are merged
and sorted by fitness: servers with less remaining capacity
across the individual resources come first, and servers with
more remaining capacity are placed at the end of the list. The
algorithm then systematically picks the servers with the least
remaining capacity and keeps them together in the same group.
Whenever a selected server contains a VM that belongs to a
previously chosen server, that server is redundant and is
removed to avoid duplication. The resulting list of servers may
not include all VMs. The remaining VMs, which are not yet
assigned to any server, are reinserted into other servers using
Algorithm 1.
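
The merge, sort and de-duplication performed by this crossover can be sketched in Python as follows. The tuple layout `Group`, the single scalar used for remaining capacity and the rule that a physical server appears at most once in the child are assumptions. Reinserting the leftover VMs with Algorithm 1 is outside the sketch, which only returns them.

```python
from typing import FrozenSet, List, Set, Tuple

# (remaining capacity, server id, hosted VM ids); the layout is an assumption.
Group = Tuple[float, str, FrozenSet[str]]


def crossover(parent_a: List[Group], parent_b: List[Group]) -> Tuple[List[Group], Set[str]]:
    """Build a child packing and return it together with the VMs left unplaced."""
    all_vms: Set[str] = set()
    for _, _, vms in parent_a:
        all_vms |= vms
    child: List[Group] = []
    placed: Set[str] = set()
    used_servers: Set[str] = set()
    # Merge both parents and visit the fullest servers (least remaining capacity) first.
    for group in sorted(parent_a + parent_b, key=lambda g: g[0]):
        _, server, vms = group
        if server in used_servers or vms & placed:
            continue  # redundant: server already chosen or a VM would be duplicated
        child.append(group)
        used_servers.add(server)
        placed |= vms
    return child, all_vms - placed


parent_a = [(0.5, "s1", frozenset({"v1", "v2"})), (3.0, "s2", frozenset({"v3", "v4"}))]
parent_b = [
    (1.0, "s3", frozenset({"v2", "v3"})),
    (2.0, "s4", frozenset({"v4"})),
    (2.2, "s5", frozenset({"v1"})),
]
child, leftover = crossover(parent_a, parent_b)
print([server for _, server, _ in child])  # ['s1', 's4']
print(leftover)                            # {'v3'}
```

Here `s1` and `s4` are kept, every other server would duplicate an already placed VM, and `v3` is left over for reinsertion.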

E. Mutation Process 

The mutation operator in our algorithm comprises three 
alternatives. First, the mutation process removes the VMs of 
randomly selected servers, and the removed VMs are then 
reinserted into the other servers in the new population based 
on our Algorithm 1. Second, two randomly chosen VMs of the 
existing packing order are interchanged between servers. In 
this process we ensure that the algorithm never interchanges 
two VMs that came from the same server. As a third option, one 
VM is shifted to a different server to generate a new packing 
order. 

Based on the information provided by the monitoring driver, 
the second and third genetic operators work on the packing 
order list to increase the performance of the ordering genetic 
process. Finally, for all of the above genetic operators, 
mutation is applied to VMs with probability inversely 
proportional to the fitness value of the server that the VMs 
originally come from. VMs placed in servers with a lower 
fitness value are mutated more frequently than VMs placed in 
servers with a higher fitness value, in order to guarantee that 
the organization of the better servers is retained. The new 
children become part of the next generation, from which we 
need to choose one solution. Whenever the exit criteria are 
satisfied, the algorithm stops and returns the servers with the 
highest fitness evaluation value. 
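
The selection rule for mutation (probability inversely proportional to server fitness) can be illustrated as follows. This is a hedged sketch: the function names and the normalization into a probability distribution are assumptions made for illustration, and fitness values are assumed to be strictly positive.

```python
import random
from typing import Dict


def mutation_probabilities(server_fitness: Dict[str, float]) -> Dict[str, float]:
    """Give each server a selection probability proportional to 1 / fitness."""
    if any(f <= 0 for f in server_fitness.values()):
        raise ValueError("fitness values must be positive")
    inverse = {server: 1.0 / f for server, f in server_fitness.items()}
    total = sum(inverse.values())
    return {server: w / total for server, w in inverse.items()}


def pick_server_for_mutation(server_fitness: Dict[str, float],
                             rng: random.Random) -> str:
    """Draw the server whose VMs will be mutated; low fitness is drawn more often."""
    probs = mutation_probabilities(server_fitness)
    servers = list(probs)
    return rng.choices(servers, weights=[probs[s] for s in servers], k=1)[0]
```

With fitness values of 1.0 and 4.0, for example, the first server is selected four times as often as the second, so well-packed servers are rarely disturbed.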

Table I: Properties required for the index table of physical 
machine and physical machine cluster 

S.No | Physical Machine | Physical Machine Cluster 
-----|------------------|------------------------- 
1 | Total number of VMs placed | Total number of PMs 
2 | Total number of VMs in each type (CPU intensive, RAM intensive, N/W intensive) | Total number of PMs exhausted 
3 | The percentage of load of the PM in each resource type individually | The cumulative percentage of the load of the entire PMs 
4 | Total number of CPU cores utilized and available | The list of PMs which can be used to place the CPU intensive VMs 
5 | Total amount of RAM utilized and available | The list of PMs which can be used to place the memory intensive VMs 
6 | Amount of N/W bandwidth utilized and available | The list of PMs which can be used to place the N/W bandwidth intensive VMs 

VIII. PERFORMANCE EVALUATION 

A. Experimental Setup 

The presented algorithm is implemented in Java using the 
NetBeans IDE. We use the CloudSim simulator to assess the 
execution and performance of our heuristics against some of 
the existing scheduling algorithms in terms of response time, 
load balancing among servers, reasonable resource utilization, 
energy consumption, minimum number of active PMs and higher 
profit by reducing the number of migrations. The performance 
of the proposed algorithm was examined from both the users' 
and the service provider's perspectives. 

Since it is difficult to access real datacenters or cloud 
infrastructures, we used simulation-based evaluation, which is 
easily reproducible, to compare the performance of the 
proposed algorithm with the following existing algorithms, 
which are currently used by the majority of cloud service 
providers: 1) First Fit Algorithm 2) Round Robin Scheduling 
Algorithm 3) Best Fit Algorithm. The simulated cloud 
environment contains a cluster of heterogeneous PMs. The total 
resource capacity of the PMs is expressed in percentage, and 
the randomly generated VM resource demand includes the number 
of CPU cores, the amount of RAM and the required network 
bandwidth. 
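
For readers unfamiliar with the three baseline policies, the following Python sketch shows one common way to express them for multi-resource demands. It is an illustration only, not the CloudSim code used in the experiments: the function `place`, the tuple layout (CPU, RAM, bandwidth) and the best-fit criterion (least total capacity left after placement) are assumptions.

```python
from typing import List, Sequence, Tuple

Resources = Tuple[int, int, int]  # (CPU, RAM, bandwidth), e.g. in percent


def place(vms: Sequence[Resources], capacities: Sequence[Resources],
          policy: str) -> List[int]:
    """Return, for every VM in order, the index of the host chosen by the policy."""
    free = [list(c) for c in capacities]   # remaining capacity per host
    n = len(free)
    cursor = 0                             # next host to try for round robin
    assignment: List[int] = []
    for demand in vms:
        # Hosts that can satisfy the demand in every resource dimension.
        candidates = [i for i in range(n)
                      if all(d <= f for d, f in zip(demand, free[i]))]
        if not candidates:
            raise ValueError(f"no host can take VM with demand {demand}")
        if policy == "first_fit":
            host = candidates[0]
        elif policy == "best_fit":
            # Tightest fit: least total capacity left over after placement.
            host = min(candidates, key=lambda i: sum(free[i]) - sum(demand))
        elif policy == "round_robin":
            # First feasible host at or after the cursor, wrapping around.
            host = min(candidates, key=lambda i: (i - cursor) % n)
            cursor = (host + 1) % n
        else:
            raise ValueError(f"unknown policy: {policy}")
        for k, d in enumerate(demand):
            free[host][k] -= d
        assignment.append(host)
    return assignment
```

On a small heterogeneous cluster the three policies already diverge: first fit fills the lowest-numbered host first, best fit prefers the host the VM fills most tightly, and round robin spreads VMs across all hosts.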

B. Analysis 

The investigations analyze the effect of our proposed 
algorithm on the number of physical servers required to place 
a certain number of VMs, the overall resource utilization rate 
of all the active servers, allocation time, load balancing, 
percentage of migration and percentage of SLA violations. The 
simulation results show that our proposed algorithm can use 
fewer physical servers for placing a certain number of VMs, 
which helps to improve the resource utilization rate. The 
response time of our algorithm is slightly higher than that of 
the first fit algorithm because it allocates VMs based on the 
user constraints and past usage history of the VMs. A higher 
SLA satisfaction rate and a lower load imbalance rate can be 
observed in the results, which also show that our 
multi-dimensional, host-load-aware and user-constraint-based 
algorithm is applicable, valuable and reliable for 
implementation in real virtualized environments. 

Rebalancing load in a datacenter environment needs live VM 
migrations, but frequently moving a large number of VMs 
between physical hosts increases datacenter cost. Hence load 
rebalancing has to be achieved with a minimum number of VM 
migrations. To solve this issue we used a modified version of 
the genetic algorithm for load optimization. Our results show 
that the percentage of VM migrations decreased, through which 
we can achieve better results for load balancing along with 
cost reduction. 
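
The evaluation quantifies this with the migration rate, which this section defines as the percentage of migrated VMs relative to the total number of VM instances. A minimal worked example, assuming placements are given as hypothetical VM-to-host mappings before and after rebalancing:

```python
from typing import Dict


def migration_rate(before: Dict[str, str], after: Dict[str, str]) -> float:
    """Percentage of VMs whose host differs between two placements."""
    if not before:
        raise ValueError("placement must contain at least one VM")
    migrated = sum(1 for vm, host in before.items() if after[vm] != host)
    return 100.0 * migrated / len(before)


# One of four VMs changes host, so the migration rate is 25%.
before = {"vm1": "pm1", "vm2": "pm1", "vm3": "pm2", "vm4": "pm2"}
after = {"vm1": "pm1", "vm2": "pm2", "vm3": "pm2", "vm4": "pm2"}
rate = migration_rate(before, after)
```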

Fig 2 shows the number of physical servers utilized by the 
scheduler to place the set of VM requests without violating 
any SLA. Our proposed host-load-aware, user-hint-based 
algorithm and the first fit algorithm use a comparable number 
of physical hosts for placing the set of VMs. The number of 
servers used by the proposed algorithm is lower than that of 
the round robin and best fit algorithms. 



Fig 2: Comparison of the number of physical servers 



Although the numbers of servers used by the first fit and 
proposed algorithms are comparable, Fig 3 shows that the 
resource utilization rate of our algorithm appreciably 
outperforms that of the other three algorithms. 

Fig 3: Comparison of the overall resource utilization rate 



Fig 4 shows that the response times of all the algorithms are 
comparatively stable. Our algorithm takes slightly more time 
to allocate VMs than the first fit algorithm because it 
allocates VMs based on the user-provided information and the 
past usage history of the VMs. 

Fig 4: Comparison of the response time of different 
algorithms 

The analysis also examines the effect of the algorithm on load 
balancing and the number of migrations needed to achieve a 
load-balanced environment subsequent to scheduling. 

Fig 5 shows the percentage of load imbalance. As the number of 
VMs to deploy increases, our algorithm balances the load of 
the data center better than the three other approaches. 

Fig 5: Comparison of the percentage of load imbalance value 

Our proposed algorithm is effective in improving the resource 
utilization rate and load balancing with the help of live 
migrations. However, one of our major aims is to increase the 
total revenue, which requires cutting the VM migration cost, 
and this can be achieved by reducing the VM migration rate. We 
use the migration rate as the evaluation metric, defined as 
the percentage of migrated VMs relative to the total number of 
VM instances. The results are shown in Fig 6. The proposed 
algorithm decreases the migration rate from about 18%-20% to 
less than 13%, which reduces the VM migration cost. Although 
the curve of our proposed algorithm indicates that only a 
small number of VMs migrated from their original host to a new 
host, we achieved better resource utilization and a balanced 
load among the physical hosts. 

Fig 6: Comparison of the percentage of VM migrations for load 
balancing 

As Fig 7 shows, the proposed algorithm has a low SLA violation 
rate because it uses the past behavior of the VM along with 
the user-provided information and maps VMs to PMs by 
considering the availability of each key resource (CPU, RAM 
and network bandwidth) individually. 

Fig 7: Comparison of the percentage of VMs that violate their 
SLA 

IX. CONCLUSION AND FUTURE WORK 

We presented our novel algorithm, which considers the user 
constraints of VMs along with the physical host load factor to 
address the problem of mapping VMs onto PMs such that the 
number of physical hosts used is minimized and the 
overutilization and underutilization of a host's resources can 
be identified and resolved at the same time, without violating 
any SLAs. Since we consider this a multi potential bin packing 
problem, we combined three different heuristics that consider 
the load factor of hosts along with user-provided information 
at the various stages of placing the VMs on physical hosts. 
Based on our analysis, we showed that our proposed algorithm 
utilizes a minimum number of physical servers for hosting the 
set of VMs, which also reduces the energy consumption of the 
datacenter, and it achieves a high resource utilization rate 
by using a minimal number of physical servers. Other 
considerable improvements of our algorithm are a lower 
percentage of load imbalance and a lower percentage of VMs 
that violate their SLA. 

As future work, we plan to integrate the proposed algorithm 
with an open source cloud platform and test its efficiency in 
a real-time environment. We would also like to model the 
interconnection prerequisites that can correctly express the 
relationships between VMs consolidated on the same host, which 
will be valuable for additional optimizations of VM scheduling 
in cloud infrastructure. 

ABSTRACT 

Due to the need for strong security for customer financial 
information, the banking sector has started introducing 
biometric fingerprint measures to secure banking systems and 
software. In this paper, we explain the methodology of using 
this technology in the banking sector for customer 
verification and authentication. The challenges and 
opportunities associated with this technology are also 
discussed. 

KEYWORDS: Security, Biometric, Fingerprint, Bank 

INTRODUCTION 

Information technology has advanced greatly over the years, 
thus encouraging more improvement in information security. In 
order to improve security measures in many data-driven 
applications, authentication methods such as biometrics play 
important roles [1]. Security is the state of being secure; in 
other words, it is building protection against adversaries. 
Since computers form the major tools used in processing data 
and manipulating information in many sectors (e.g. the banking 
sector), there is a need for adequate security for these 
computers. Meanwhile, [2] defines computer security as the 
need to secure physical locations, hardware and computer 
software from outside threats. There exist multiple layers of 
computer security, namely physical security, personal 
security, operational security, communication security, 
network security and information security [2]. 

All these layers of computer security have received 
considerable attention from researchers since the start of the 
information age, and a lot of improvement has been recorded in 
them. Meanwhile, the layer of information or software security 
still needs a lot of attention, as do the other layers. 
Computer software is used to process data and verify 
customers' account details in the banking sector. These 
computers need rigorous software security because any small 
compromise of a system such as the banking Automatic Teller 
Machine (ATM) application can lead to the loss of large 
amounts of money, which creates problems for the banks and 
their customers. 

Meanwhile, for a very long time the banking sector has been 
using the account number, account name and customer's 
signature for account verification and authentication. These 
methods of verification and authentication have made banking 
operations very easy for the elite and highly difficult for 
the non-elite, and they pose many challenges in securing 
customers' data and money. This is because people can easily 
copy someone's account number and forge his/her signature to 
commit fraud on that person's account. Also, many people who 
are not familiar with the concept of PIN and account number 
are unlikely to memorize and recognize them [4]. This mainly 
applies to non-educated customers, and it has kept many aged 
people, mainly the non-educated ones, from using banks for 
their transactions even as we talk about a cashless society. 
The truth is that if we must attain a cashless society, 
everybody (both educated and non-educated) must make use of 
banking transactions; thus banking operations and methods must 
be made simple to access and use. In light of the above, the 
banking sector has been making more efforts to introduce 
biometrics as a means of customer account verification and 
authentication. Recently the Central Bank of Nigeria made it 
mandatory for all bank customers to register their biometric 
information with their respective banks. However, these 
biometrics are not yet used as a means of account verification 
and authentication. 

Biometrics is the utilization of physiological characteristics 
to differentiate an individual. It utilizes biological 
characteristics or behavioral features to recognize an 
individual. It is a new way to verify authenticity [3]. The 
reason biometrics is gaining more attention in the banking 
sector is that, if used as a means of identification, it will 
enhance information security and encourage many customers 
(both educated and non-educated) to perform their transactions 
using banking services. However, there are challenges and 
opportunities associated with the use of biometric 
fingerprints as a means of account verification and 
authentication. This paper therefore presents the most common 
challenges and opportunities associated with using biometric 
fingerprints as a means of account verification and 
authentication in the banking sector. The paper also presents 
some solutions to these challenges, if biometric fingerprint 
account verification and authentication is to see the light of 
day. 

BIOMETRIC FINGERPRINT 

Fingerprints are unique to every human. They are formed by 
numerous ridges and valleys on the surface of the human 
finger. A fingerprint is the flow of ridge patterns on the tip 
of the finger. Among all biometric traits, the fingerprint has 
one of the highest levels of reliability [5]. With the rapid 
growth of information security, fingerprints are widely used 
to secure information systems and are highly reliable. This 
has led many researchers to advocate the full use of this 
technology in securing information in different sectors. 
Fingerprints have many applications, such as banking security, 
ATM security, card transactions, physical access control, 
voting and identification of criminals, as recorded by [6]. 

Similarly, [7] shows how a fingerprint can be used to control 
examination screening. The possibility of using a fingerprint 
to perform verification and authentication is determined by 
the pattern of ridges and furrows as well as the minutiae 
points. It is also possible and highly secure to use 
fingerprints in electronic voting systems, as noted by [8]. 

METHODOLOGY 

The banking sector manages a large amount of customer data; 
hence there is a need to uniquely identify a particular 
customer for optimal operation and for security purposes. This 
brought the idea of using account number, signature, name and 
possibly PINs to identify the individual. However, because of 
the changes in our society, banking applications need stronger 
security methods than the ones mentioned above; hence the need 
for a biometric verification system cannot be underestimated. 
The question is: how can we use biometric fingerprints to 
secure customers' information in the banking sector? 

The fingerprint scanner will be used to collect customers' 
fingerprint samples with the aid of a well-designed banking 
application, and the samples will be stored in the application 
database. The application will have an extended graphical user 
interface that adopts biometric fingerprint access control 
techniques. Whenever a customer needs his/her account details, 
he/she will place a finger on the scanner provided. The 
fingerprint image will be captured and compared with the 
fingerprint images available in the system database to 
ascertain whether a match exists. If there is a match, the 
system will display the information corresponding to that 
fingerprint image as stored in the database; otherwise an 
error message will be displayed to the system user. 
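
The enrolment and lookup flow described above can be sketched as follows. This is a toy illustration under loud assumptions, not a production matcher and not any scanner vendor's SDK: fingerprints are represented as hypothetical lists of minutiae coordinates that are already aligned, and the class `FingerprintStore`, the function `match_score`, the distance `tolerance` and the acceptance `threshold` are all invented for the example.

```python
import math
from typing import Dict, List, Optional, Tuple

Minutia = Tuple[float, float]   # (x, y) position of a ridge ending or bifurcation


def match_score(probe: List[Minutia], template: List[Minutia],
                tolerance: float) -> float:
    """Fraction of minutiae that pair up within `tolerance` (0.0 to 1.0)."""
    if not probe or not template:
        return 0.0
    unused = list(template)
    hits = 0
    for point in probe:
        for candidate in unused:
            if math.dist(point, candidate) <= tolerance:
                unused.remove(candidate)   # each template minutia is used once
                hits += 1
                break
    return hits / max(len(probe), len(template))


class FingerprintStore:
    """In-memory stand-in for the application database of enrolled templates."""

    def __init__(self, tolerance: float = 3.0, threshold: float = 0.8) -> None:
        self.tolerance = tolerance
        self.threshold = threshold
        self.templates: Dict[str, List[Minutia]] = {}

    def enroll(self, account_id: str, minutiae: List[Minutia]) -> None:
        self.templates[account_id] = list(minutiae)

    def identify(self, probe: List[Minutia]) -> Optional[str]:
        """Return the best-matching account id, or None if nothing passes the threshold."""
        best_id, best_score = None, 0.0
        for account_id, template in self.templates.items():
            score = match_score(probe, template, self.tolerance)
            if score > best_score:
                best_id, best_score = account_id, score
        return best_id if best_score >= self.threshold else None


store = FingerprintStore()
store.enroll("ACC-001", [(10, 10), (20, 35), (40, 12), (55, 60)])
store.enroll("ACC-002", [(5, 50), (30, 30), (70, 20), (80, 80)])
found = store.identify([(11, 10), (20, 34), (41, 13), (55, 61)])
unknown = store.identify([(100, 100), (0, 0), (90, 5), (15, 75)])
```

A caller would display the account details when `identify` returns an account id and show an error message when it returns `None`. Comparing compact feature templates rather than whole images also keeps the amount of data stored and transferred small.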

CHALLENGES 

There are many challenges in using biometric fingerprint 
as a means of account verification and authentication in 
the banking sector. 

Allowing Artificial Fingerprint: Many fingerprint systems find 
it very difficult to detect artificial fingerprints, as noted 
by [3]. This is a serious challenge in using biometric systems 
in banking, as an artificial fingerprint can be used to trick 
the biometric application software into giving access to the 
user. It is therefore a serious challenge for researchers to 
find better alternatives that support fingerprint systems in 
detecting artificial fingerprints. 

Fingerprint Image Processing Resources: Fingerprint images 
require a large amount of computer resources before they can 
be successfully processed. If this technology is employed in 
banking applications without finding a solution to the large 
amount of computer resources needed to store and process 
fingerprint images, the entire system will be slow and its 
performance will not be encouraging. 

Processing Fingerprint Images: Processing images across the 
network is always time consuming. Methods of comparing and 
processing fingerprint images that use only some of the 
image's vital properties rather than the complete image will 
therefore help to improve the processing speed of biometric 
images, especially in the banking sector, where customer 
satisfaction and quick response are the watchwords. 

Scanner Software Development Kits: A fingerprint scanner has a 
kit that must be used during application development. When 
this kit is not compatible with the technology used during 
software development, there is always a serious problem. 

Registration Process: Sometimes it may take many swipes of a 
finger to register [3]. Thus, there is a need for improved 
methods of performing quick registration using this kind of 
system. 

Society Effect on Human Fingerprints: The performance of a 
fingerprint system for identification and authentication of 
customers' records in the banking sector is highly affected by 
the surface of the individual's fingerprint. Some people do 
not have fingerprints, some have fingerprints that have been 
affected by chemicals, and some have cuts on their fingers. 
All these pose a lot of challenges in using fingerprints for 
account verification and validation. 

OPPORTUNITIES 

Security: Biometrics provide strong security to systems that 
need strong security and authentication. Awasthi and Ingolikar 
(2013) noted that biometrics provide more reliability than 
other traditional authentication components. Using biometrics 
for account verification and authentication will provide 
strong security to the system and its operation. 

Cashless Society: When biometrics is used for managing 
customers' accounts, it will encourage both educated and 
non-educated people to make use of bank services, since 
customers do not need to memorize an account number or 
signature before accessing their account details, and it will 
help to achieve a cashless society in Nigeria. With this, a 
large number of people will be involved in using banking 
services, even the old man in the remote village. 

PIN-less Society: Using biometrics for account verification 
and authentication will eliminate the use of PINs in accessing 
account details, since when a PIN is stolen the financial 
information of that customer is at serious risk. 

Uniqueness of Fingerprint: The fingerprint is unique to every 
human. Not even twins have the same fingerprint, making 
fingerprint technology credibly secure for account 
verification and authentication. 

Reduction of Cash Theft: Biometric systems will help to 
reduce, if not totally eliminate, cash theft since the real 
account owner must be present before the account information 
can be accessed and transactions made on the account. 

Convenience: Biometric systems are convenient in environments 
where access privileges are necessary. Biometric account 
verification will allow account owners to move around with 
their account details without carrying an additional 
electronic device. This convenience alone is a great 
opportunity in biometric systems. 

Eliminating Password Administration Cost: The cost of 
administering and controlling passwords will be totally 
eliminated with biometric systems in account verification and 
authentication. 

CONCLUSION 

Biometrics is gaining interest and attention in many fields of 
human endeavour for providing strong security to the systems 
used in those fields. This also applies to today's banking 
sector, where people are advocating the full implementation of 
biometrics as a means of account verification and 
authentication. This paper presents a short introduction to 
biometric techniques in securing systems, with emphasis on how 
they can be used to secure customer account information in the 
banking sector. The challenges and opportunities of using 
biometrics in banking applications were also discussed. The 
technology has challenges that are important to address, yet 
it has opportunities that must not be underestimated. The 
information provided in this paper will help to guide the full 
implementation of biometric account verification systems in 
the Nigerian banking sector. 